Tru64 UNIX 


Assembly Language Programmer's Guide 


Part Number: AA-RH9LB-TE 


August 2000 


Product Version: Tru64 UNIX Version 5.1 or higher 


This manual describes the assembly language supported by the Tru64™ 
UNIX compiler system. 


Compaq Computer Corporation 
Houston, Texas 


©2000 Compaq Computer Corporation 


COMPAQ and the Compaq logo Registered in U.S. Patent and Trademark Office. Alpha and Tru64 are 
trademarks of Compaq Information Technologies Group, L.P. 


Microsoft and Windows are trademarks of Microsoft Corporation. UNIX and The Open Group are 
trademarks of The Open Group. All other product names mentioned herein may be trademarks or
registered trademarks of their respective companies. 


Portions of this document © MIPS Computer Systems, Inc., 1990. 


Confidential computer software. Valid license from Compaq required for possession, use, or copying. 
Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software 
Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under 
vendor’s standard commercial license. 


Compaq shall not be liable for technical or editorial errors or omissions contained herein. The information 
in this document is subject to change without notice. 


THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY 
KIND. THE ENTIRE RISK ARISING OUT OF THE USE OF THIS INFORMATION REMAINS WITH 
RECIPIENT. IN NO EVENT SHALL COMPAQ BE LIABLE FOR ANY DIRECT, CONSEQUENTIAL, 
INCIDENTAL, SPECIAL, PUNITIVE, OR OTHER DAMAGES WHATSOEVER (INCLUDING WITHOUT 
LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION OR LOSS 
OF BUSINESS INFORMATION), EVEN IF COMPAQ HAS BEEN ADVISED OF THE POSSIBILITY
OF SUCH DAMAGES AND WHETHER IN AN ACTION OF CONTRACT OR TORT, INCLUDING
NEGLIGENCE. 


The limited warranties for Compaq products are exclusively set forth in the documentation accompanying 
such products. Nothing herein should be construed as constituting a further or additional warranty. 


Contents


About This Manual


1  Architecture-Based Considerations

1.1 Registers
1.1.1 Integer Registers
1.1.2 Floating-Point Registers
1.2 Bit and Byte Ordering
1.3 Addressing
1.3.1 Aligned Data Operations
1.3.2 Unaligned Data Operations
1.4 Exceptions
1.4.1 Main Processor Exceptions
1.4.2 Floating-Point Processor Exceptions


2  Lexical Conventions

2.1 Blank and Tab Characters
2.2 Comments
2.3 Identifiers
2.4 Constants
2.4.1 Scalar Constants
2.4.2 Floating-Point Constants
2.4.3 String Constants
2.5 Multiple Lines Per Physical Line
2.6 Statements
2.6.1 Labels
2.6.2 Null Statements
2.6.3 Keyword Statements
2.6.4 Relocation Operands
2.7 Expressions
2.7.1 Expression Operators
2.7.2 Expression Operator Precedence Rules
2.7.3 Data Types
2.7.4 Type Propagation in Expressions
2.8 Address Formats


3  Main Instruction Set

3.1 Load and Store Instructions
3.1.1 Load Instruction Descriptions
3.1.2 Store Instruction Descriptions
3.2 Arithmetic Instructions
3.3 Logical and Shift Instructions
3.4 Relational Instructions
3.5 Move Instructions
3.6 Control Instructions
3.7 Byte-Manipulation Instructions
3.8 Special-Purpose Instructions


4  Floating-Point Instruction Set

4.1 Background Information on Floating-Point Operations
4.1.1 Floating-Point Data Types
4.1.2 Floating-Point Control Register
4.1.3 Floating-Point Exceptions
4.1.4 Floating-Point Rounding Modes
4.1.5 Floating-Point Instruction Qualifiers
4.2 Floating-Point Load and Store Instructions
4.3 Floating-Point Arithmetic Instructions
4.4 Floating-Point Relational Instructions
4.5 Floating-Point Move Instructions
4.6 Floating-Point Control Instructions
4.7 Floating-Point Special-Purpose Instructions


5  Assembler Directives


6  Programming Considerations

6.1 Calling Conventions
6.2 Program Model
6.3 General Coding Concerns
6.3.1 Register Use
6.3.2 Using Directives to Control Sections and Location Counters
6.3.3 The Stack Frame
6.3.4 Coding Examples
6.4 Developing Code for Procedure Calls
6.4.1 Calling a High-Level Language Procedure
6.4.2 Calling an Assembly Language Procedure
6.5 Memory Allocation


A  Instruction Summaries


B  32-Bit Considerations

B.1 Canonical Form
B.2 Longword Instructions
B.3 Quadword Instructions for Longword Operations
B.4 Logical Shift Instructions
B.5 Conversions to Quadword
B.6 Conversions to Longword


C  Basic Machine Definition

C.1 Implicit Register Use
C.2 Addresses
C.3 Immediate Values
C.4 Load and Store Instructions
C.5 Integer Arithmetic Instructions
C.6 Floating-Point Load Immediate Instructions
C.7 One-to-One Instruction Mappings


D  PALcode Instruction Summaries

D.1 Unprivileged PALcode Instructions
D.2 Privileged PALcode Instructions


Index


Examples

6-1 Nonleaf Procedure
6-2 Leaf Procedure Without Stack Space for Local Variables
6-3 Leaf Procedure with Stack Space for Local Variables


Figures

1-1 Byte Ordering
4-1 Floating-Point Data Formats
4-2 Floating-Point Control Register
    Sections and Location Counters for Nonshared Object Files
    Stack Organization
    Default Layout of Memory (User Program View)


Tables

2-1 Backslash Conventions
    Expression Operators
    Operator Precedence
    Data Types
    Address Formats
    Load and Store Formats
    Load Instruction Descriptions
    Store Instruction Descriptions
    Arithmetic Instruction Formats
    Arithmetic Instruction Descriptions
    Logical and Shift Instruction Formats
    Logical and Shift Instruction Descriptions
    Relational Instruction Formats
    Relational Instruction Descriptions
    Move Instruction Formats
    Move Instruction Descriptions
    Control Instruction Formats
    Control Instruction Descriptions
    Byte-Manipulation Instruction Formats
    Byte-Manipulation Instruction Descriptions
    Special-Purpose Instruction Formats
    Special-Purpose Instruction Descriptions
    Qualifier Combinations for Floating-Point Instructions
    Load and Store Instruction Formats
    Load and Store Instruction Descriptions
    Arithmetic Instruction Formats
    Arithmetic Instruction Descriptions
    Relational Instruction Formats
    Relational Instruction Descriptions
    Move Instruction Formats
    Move Instruction Descriptions
    Control Instruction Formats
    Control Instruction Descriptions
    Special-Purpose Instruction Formats
    Control Register Instruction Descriptions
    Summary of Assembler Directives
6-1 Integer Registers
6-2 Floating-Point Registers
    Argument Locations
    Main Instruction Set Summary
    Floating-Point Instruction Set Summary
    Rounding and Trapping Modes
    Unprivileged PALcode Instructions
    Privileged PALcode Instructions


About This Manual 


This manual describes the assembly language supported by the Tru64™ 
UNIX compiler system, its syntax rules, and how to write some assembly 
programs. For information about assembling and linking a program written 
in assembly language, see the as(1) and ld(1) reference pages.


The assembler converts assembly language statements into machine code. In 
most assembly languages, each instruction corresponds to a single machine
instruction; however, in the assembly language for the Tru64 UNIX compiler 
system, some instructions correspond to multiple machine instructions. 


The assembler’s primary purpose is to produce object modules from the 
assembly instructions generated by some high-level language compilers. As 
a result, the assembler lacks many functions that are normally present
in assemblers designed to produce object modules from source programs
coded in assembly language. It also includes some functions that are not 
found in such assemblers because of special requirements associated with 
the high-level language compilers. 


Audience 


This manual assumes that you are an experienced assembly language 
programmer. 


It is recommended that you use the assembler only when you need to 
perform programming tasks such as the following: 


• Maximize the efficiency of a routine (for example, a low-level I/O
  driver) in a way that might not be possible in C, Fortran-77, Pascal, or
  another high-level language.

• Access machine functions unavailable from high-level languages or
  satisfy special constraints such as restricted register usage.

• Change the operating system.

• Change the compiler system.


New and Changed Features 


The major technical changes to the manual are as follows: 


• The following directives are no longer supported and their descriptions
  have been deleted from Chapter 5: .alias, .bgnb, .endb, .gjsrlive,
  .gjsrsaved, .livereg, .noalias, .ugen, and .vreg.

• Descriptions of the following new directives have been added to
  Chapter 5: .ident, .tlscomm, .tlsdate, and .tlslcomm.


Organization 


This manual is organized as follows: 


Chapter 1     Describes the format for the general registers, the special
              registers, and the floating-point registers. It also describes
              how addressing works and the exceptions you might
              encounter with assembly programs.

Chapter 2     Describes the lexical conventions that the assembler follows.

Chapter 3     Describes the main processor’s instruction set, including
              notation, load and store instructions, computational
              instructions, and jump and branch instructions.

Chapter 4     Describes the floating-point instruction set.

Chapter 5     Describes the assembler directives.

Chapter 6     Describes calling conventions for all supported high-level
              languages. It also discusses memory allocation and register
              use.

Appendix A    Summarizes all assembler instructions.

Appendix B    Describes issues related to the processing of 32-bit data.

Appendix C    Describes instructions that generate more than one
              machine instruction.

Appendix D    Describes the PALcode (privileged architecture library code)
              instructions required to support an Alpha system.


Related Documents 


The following manuals provide additional information on many of the topics 
addressed in this manual: 


• Programmer's Guide

• The Alpha Architecture Reference Manual, 3rd Edition
  (Butterworth-Heinemann Press, ISBN: 1-55558-202-8)

• Calling Standard for Alpha Systems

• Object File/Symbol Table Format Specification (This manual is available
  as an HTML or PDF document on the documentation CD-ROM; it is
  not available in hardcopy.)


Icons on Tru64 UNIX Printed Books 


The printed version of the Tru64 UNIX documentation uses letter icons on 
the spines of the books to help specific audiences quickly find the books that 
meet their needs. (You can order the printed documentation from Compaq.)
The following list describes this convention: 


G    Books for general users
S    Books for system and network administrators
P    Books for programmers
D    Books for device driver writers
R    Books for reference page users


Some books in the documentation help meet the needs of several audiences. 
For example, the information in some system books is also used by 
programmers. Keep this in mind when searching for information on specific 
topics. 


The Documentation Overview provides information on all of the books in 
the Tru64 UNIX documentation set. 


Reader’s Comments 


Compaq welcomes any comments and suggestions you have on this and 
other Tru64 UNIX manuals. 


You can send your comments in the following ways: 


• Fax: 603-884-0120 Attn: UBPG Publications, ZKO3-3/Y32

• Internet electronic mail: readers_comment@zk3.dec.com

  A Reader’s Comment form is located on your system in the following
  location:

  /usr/doc/readers_comment.txt

• Mail:

  Compaq Computer Corporation
  UBPG Publications Manager
  ZKO3-3/Y32
  110 Spit Brook Road
  Nashua, NH 03062-2698

  A Reader’s Comment form is located in the back of each printed manual.
  The form is postage paid if you mail it in the United States.


Please include the following information along with your comments: 


• The full title of the book and the order number. (The order number is
  printed on the title page of this book and on its back cover.)

• The section numbers and page numbers of the information on which
  you are commenting.

• The version of Tru64 UNIX that you are using.

• If known, the type of processor that is running the Tru64 UNIX software.


The Tru64 UNIX Publications group cannot respond to system problems
or technical support inquiries. Please address technical questions to your
local system vendor or to the appropriate Compaq technical support office. 
Information provided with the software media explains how to send problem 
reports to Compaq. 


Conventions 


file        Italic (slanted) type indicates variable values,
            placeholders, and function argument names.

[ | ]
{ | }       In syntax definitions, brackets indicate items that
            are optional and braces indicate items that are
            required. Vertical bars separating items inside
            brackets or braces indicate that you choose one item
            from among those listed.

. . .       In syntax definitions, a horizontal ellipsis indicates
            that the preceding item can be repeated one or
            more times.

cat(1)      A cross-reference to a reference page includes
            the appropriate section number in parentheses.
            For example, cat(1) indicates that you can find
            information on the cat command in Section 1 of
            the reference pages.


Architecture-Based Considerations


This chapter describes programming considerations that are determined by 
the Alpha™ system architecture. It addresses the following topics: 


• Registers (Section 1.1)

• Bit and byte ordering (Section 1.2)

• Addressing (Section 1.3)

• Exceptions (Section 1.4)


1.1 Registers 


This section discusses the registers that are available on Alpha systems 
and describes how memory organization affects them. See Section 6.3 for 
information on register use and linkage. 


Alpha systems have the following types of registers: 

• Integer registers (Section 1.1.1)

• Floating-point registers (Section 1.1.2)


You must use integer registers where the assembly instructions expect 
integer registers and floating-point registers where the assembly 
instructions expect floating-point registers. If you confuse the two, the 
assembler issues an error message. 


The assembler reserves all register names (see Section 6.3.1). All register 
names start with a dollar sign ($) and all alphabetic characters in register 
names are lowercase. 


1.1.1 Integer Registers 


Alpha systems have 32 integer registers, each of which is 64 bits wide.
Integer registers are sometimes referred to as general registers in other
system architectures.


The integer registers have the names $0 to $31.


By including the file regdef.h (use #include <alpha/regdef.h>) in
your assembly language program, you can use the software names of all of
the integer registers, except for $28, $29, and $30. The operating system
and the assembler use the integer registers $28, $29, and $30 for specific
purposes.


Note 


If you need to use the registers reserved for the operating system 
and the assembler, you must specify their alias names in your 
program, not their regular names. The alias names for $28, $29, 
and $30 are $at, $gp, and $sp, respectively. To prevent you
from using these registers unknowingly and thereby producing 
potentially unexpected results, the assembler issues warning 
messages if you specify their regular names in your program. 


The $gp register (integer register $29) is available as a general 
register on some non-Alpha compiler systems when the -G 0 
compilation option is specified. It is not available as a general 
register on Alpha systems under any circumstances. 


Integer register $31 always contains the value 0. All other integer registers 
can be used interchangeably, except for integer register $30, which is
assumed to be the stack pointer by certain PALcode instructions. See
Table 6-1 for a description of integer register assignments. See Appendix D
and the Alpha Architecture Handbook for information on PALcode (Privileged
Architecture Library code).


1.1.2 Floating-Point Registers 


Alpha systems have 32 floating-point registers, each of which is 64 bits 
wide. Each register can hold one single-precision (32-bit) value or one 
double-precision (64-bit) value. 


The floating-point registers have the names $f0 to $f31.


Floating-point register $f31 always contains the value 0.0. All other
floating-point registers can be used interchangeably. See Table 6-2 for a 
description of floating-point register assignments. 


1.2 Bit and Byte Ordering 


A system's byte-ordering scheme, or endian scheme, affects memory 
organization and defines the relationship between address and byte position 
of data in memory: 


• Big-endian systems store the sign bit in the lowest address byte.

• Little-endian systems store the sign bit in the highest address byte.

Alpha systems use the little-endian scheme. Byte ordering is as follows:

• The bytes of a quadword are numbered from 7 to 0. Byte 7 holds the sign
  and most significant bits.

• The bytes of a longword are numbered from 3 to 0. Byte 3 holds the sign
  and most significant bits.

• The bytes of a word are numbered from 1 to 0. Byte 1 holds the sign
  and most significant bits.


The bits of each byte are numbered from 7 to 0, using the format shown
in Figure 1-1. (Bit numbering is a software convention; no assembler
instructions depend on it.)


Figure 1-1: Byte Ordering

Quadword
  Byte:    7      6      5      4      3      2      1      0
  Bit:   63-56  55-48  47-40  39-32  31-24  23-16  15-8    7-0
  Byte 7 holds the sign and most significant bits.

Longword
  Byte:    3      2      1      0
  Bit:   31-24  23-16  15-8    7-0
  Byte 3 holds the sign and most significant bits.

Word
  Byte:    1      0
  Bit:   15-8    7-0
  Byte 1 holds the sign and most significant bits.

Byte
  Bit:   7  6  5  4  3  2  1  0
  Bit 7 is the most significant bit; bit 0 is the least significant bit.
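
The byte numbering described in this section can be checked on any host
with a short script. The following is an illustrative sketch only: Python
is not part of the Tru64 UNIX compiler system, and the sample values are
arbitrary. It packs a quadword, a longword, and a word in little-endian
order and inspects the byte at the highest address.

```python
import struct

# "<" selects little-endian layout, the scheme Alpha systems use.
quadword = struct.pack("<q", -2)          # 64-bit signed value
longword = struct.pack("<i", 0x12345678)  # 32-bit signed value
word = struct.pack("<h", 0x1234)          # 16-bit signed value

# Index 0 is the lowest address (byte 0). The last index is the highest
# address and holds the sign and most significant bits.
quad_sign_bit = quadword[7] >> 7          # set because the value is negative
long_high_byte = longword[3]              # most significant byte, 0x12
word_high_byte = word[1]                  # most significant byte, 0x12
```

In each case the least significant byte sits at index 0, which corresponds
to byte 0 in Figure 1-1.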


1.3 Addressing 


This section describes the byte-addressing schemes for load and store 
instructions. (Section 2.8 describes the formats in which you can specify 
addresses.) 


1.3.1 Aligned Data Operations


All Alpha systems use the following byte-addressing scheme for aligned data: 


• Access to words requires alignment on byte boundaries that are evenly
  divisible by two.

• Access to longwords requires alignment on byte boundaries that are
  evenly divisible by four.

• Access to quadwords requires alignment on byte boundaries that are
  evenly divisible by eight.


Any attempt to address a data item that does not have the proper alignment 
causes an alignment exception. 
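
The alignment rules above amount to a divisibility test on the address.
The following sketch, written in Python purely for illustration, expresses
that test; the names ALIGNMENT, is_aligned, and align_up are hypothetical
and do not correspond to anything in the assembler.

```python
# Required alignment, in bytes, for each data size named above.
ALIGNMENT = {"word": 2, "longword": 4, "quadword": 8}


def is_aligned(address, kind):
    """Return True if address is evenly divisible by the required alignment."""
    return address % ALIGNMENT[kind] == 0


def align_up(address, kind):
    """Round address up to the next properly aligned boundary."""
    size = ALIGNMENT[kind]
    # Works because every alignment in the table is a power of two.
    return (address + size - 1) & ~(size - 1)
```

For example, address 0x1004 satisfies the longword rule but not the
quadword rule, so an aligned quadword access at that address would be
expected to raise an alignment exception.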


The following instructions load or store aligned data: 


• Load quadword (ldq)

• Store quadword (stq)

• Load longword (ldl)

• Store longword (stl)

• Load word (ldw)

• Store word (stw)

• Load word unsigned (ldwu)


1.3.2 Unaligned Data Operations 


The assembler’s unaligned load and store instructions operate on arbitrary 
byte boundaries. They all generate multiple machine-code instructions. 
They do not raise alignment exceptions. 


The following instructions load and store unaligned data: 


• Unaligned load quadword (uldq)

• Unaligned store quadword (ustq)

• Unaligned load longword (uldl)

• Unaligned store longword (ustl)

• Unaligned load word (uldw)

• Unaligned store word (ustw)

• Unaligned load word unsigned (uldwu)

• Load byte (ldb)

• Store byte (stb)

• Load byte unsigned (ldbu)


1.4 Exceptions


The Alpha system detects some exceptions directly, and other exceptions are 
signaled as a result of specific tests that are inserted by the assembler. 


The following sections describe exceptions that you may encounter during 
the execution of assembly programs. Only those exceptions that occur most 
frequently are described. 


1.4.1 Main Processor Exceptions 


The following exceptions are the most common to the main processor: 


• Address error exceptions occur when an address is invalid for the
  executing process or, in most instances, when a reference is made to a
  data item that is not properly aligned.

• Overflow exceptions occur when arithmetic operations compute signed
  values and the destination lacks the precision to store the result.

• Bus exceptions occur when an address is invalid for the executing
  process.

• Divide-by-zero exceptions occur when a divisor is zero.


1.4.2 Floating-Point Processor Exceptions 


The following exceptions are the most common floating-point exceptions: 


• Invalid operation exceptions include the following:

  - Magnitude subtraction of infinities, for example, (+INF) - (+INF).

  - Multiplication of 0 by INF with any signs.

  - Division of 0 by 0 or INF by INF with any signs.

  - Conversion of a binary floating-point number to an integer format,
    that is, only in those cases in which the conversion produces an
    overflow or an operand value of infinity or NaN. (The cvttq
    instruction converts floating-point numbers to integer formats.)

  - Comparison of predicates that have unordered operands and involve
    Less Than or Less Than or Equal.

  - Any operation on a signaling NaN. (See the introduction of Chapter 4
    for a description of NaN symbols.)

• Divide-by-zero exceptions occur when a divisor is zero.

• Overflow exceptions occur when a rounded floating-point result exceeds
  the destination format’s largest finite number.

• Underflow exceptions occur when a result has lost accuracy and also
  when a nonzero result is between ±2^Emin (plus or minus 2 to the
  minimum expressible exponent).

• Inexact exceptions occur if the infinitely precise result differs from the
  rounded result.


For additional information on floating-point exceptions, see Section 4.1.3. 


Lexical Conventions


This chapter describes lexical conventions associated with the following 
items: 


• Blank and tab characters (Section 2.1)

• Comments (Section 2.2)

• Identifiers (Section 2.3)

• Constants (Section 2.4)

• Physical lines (Section 2.5)

• Statements (Section 2.6)

• Expressions (Section 2.7)

• Address formats (Section 2.8)


2.1 Blank and Tab Characters 


You can use blank and tab characters anywhere between operators, 
identifiers, and constants. Adjacent identifiers or constants that are not 
otherwise separated must be separated by a blank or tab. 


These characters can also be used within character constants; however, they 
are not allowed within operators and identifiers. 


2.2 Comments 


The number sign character (#) introduces a comment. Comments that start 
with a number sign extend through the end of the line on which they appear. 
You can also use C language notation (/* ... */) to delimit comments.


Do not start a comment with a number sign in column one; the assembler 
uses cpp (the C language preprocessor) to preprocess assembler code, and 
cpp interprets number signs in the first column as preprocessor directives. 


2.3 Identifiers 


An identifier consists of a case-sensitive sequence of alphanumeric 
characters (A-Z, a-z, 0-9) and the following special characters: 


• . (period)

• _ (underscore)

• $ (dollar sign)


Identifiers can be up to 31 characters long, and the first character cannot 
be numeric (0-9). 
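
As an informal restatement of these rules, the following Python sketch
tests whether a string is lexically a valid identifier. The names
IDENTIFIER and is_identifier are hypothetical. The check covers only the
character set, the first-character rule, and the 31-character limit stated
above; it does not account for reserved names such as register names.

```python
import re

# Allowed characters: letters, digits, period, underscore, dollar sign.
# The first character cannot be a digit; the total length is at most 31.
IDENTIFIER = re.compile(r"[A-Za-z._$][A-Za-z0-9._$]{0,30}")


def is_identifier(text):
    """Return True if text satisfies the lexical rules for an identifier."""
    return IDENTIFIER.fullmatch(text) is not None
```

Because identifiers are case sensitive, names such as Loop and loop pass
this check as two distinct identifiers.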


If an undefined identifier is referenced, the assembler assumes that the 
identifier is an external symbol. The assembler treats the identifier like a
name specified by a .globl directive (see Chapter 5).


If the identifier is defined to the assembler and the identifier has not been 
specified as global, the assembler assumes that the identifier is a local 
symbol. 


2.4 Constants 


The assembler supports the following constants: 

• Scalar constants (Section 2.4.1)

• Floating-point constants (Section 2.4.2)

• String constants (Section 2.4.3)


2.4.1 Scalar Constants 


The assembler interprets all scalar constants as two's complement numbers. 
Scalar constants can be any of the digits 0123456789abcdefABCDEF.


Scalar constants can be either decimal, hexadecimal, or octal constants: 


• Decimal constants consist of a sequence of decimal digits (0-9) without
  a leading zero.

• Hexadecimal constants consist of the characters 0x (or 0X) followed by a
  sequence of hexadecimal digits (0-9abcdefABCDEF).

• Octal constants consist of a leading zero followed by a sequence of octal
  digits (0-7).
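
The three forms can be told apart from their leading characters alone.
The following Python sketch illustrates that classification; it is not the
assembler's parser. The function names are hypothetical, a lone 0 is
treated as decimal, and the 64-bit width used for the two's complement
reinterpretation is an assumption chosen to match the quadword size.

```python
import string


def parse_scalar(text):
    """Classify a scalar constant and return (kind, value)."""
    if text[:2] in ("0x", "0X"):
        kind, base, digits, allowed = "hexadecimal", 16, text[2:], string.hexdigits
    elif len(text) > 1 and text[0] == "0":
        kind, base, digits, allowed = "octal", 8, text[1:], string.octdigits
    else:
        kind, base, digits, allowed = "decimal", 10, text, string.digits
    if not digits or any(ch not in allowed for ch in digits):
        raise ValueError("not a scalar constant: " + text)
    return kind, int(digits, base)


def as_signed_quadword(value):
    """Reinterpret value as a 64-bit two's complement number (assumed width)."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value
```

Under this reading, 017 is the octal constant 15, and 09 is rejected
because 9 is not an octal digit.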


2.4.2 Floating-Point Constants 


Floating-point constants can appear only in floating-point directives (see 
Chapter 5) and in the floating-point load immediate instructions (see 
Section 4.2). Floating-point constants have the following format: 

±d1[.d2][e|E±d3]

d1
    A decimal integer that denotes the integral part of the floating-point
    value.

d2
    A decimal integer that denotes the fractional part of the floating-point
    value.

d3
    A decimal integer that denotes a power of 10.


The + symbol (plus sign) is optional.


For example, the number .02173 can be represented as follows: 


21.73E-3 


The floating-point directives, such as .float and .double, may optionally 
use hexadecimal floating-point constants instead of decimal constants. A 
hexadecimal floating-point constant consists of the following elements: 


[+|-]0x[1|0].<hex-digits>h0x<hex-digits>


The assembler places the first set of hexadecimal digits (excluding the 0 or 
1 preceding the decimal point) in the mantissa field of the floating-point 
format without attempting to normalize it. It stores the second set of 
hexadecimal digits in the exponent field without biasing them. If the 
mantissa appears to be denormalized, it checks to determine whether the 
exponent is appropriate. Hexadecimal floating-point constants are useful for 
generating IEEE special symbols and for writing hardware diagnostics.


For example, either of the following directives generates the single-precision 
number 1.0: 


-float 1.0e+0 
.float Ox1.0hOx7£ 


The assembler uses normal (nearest) rounding mode to convert floating-point 
constants. 
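
To see why an unbiased exponent field of 0x7f with a zero mantissa field yields 1.0, the following Python sketch packs raw fields into the IEEE single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits). The function name single_from_fields is hypothetical, and the sketch does not model how the assembler aligns the hexadecimal mantissa digits within the field.

```python
import struct


def single_from_fields(sign, exponent, mantissa):
    """Pack raw IEEE single-precision fields and return the resulting value."""
    if not (0 <= sign <= 1 and 0 <= exponent <= 0xFF and 0 <= mantissa < (1 << 23)):
        raise ValueError("field out of range")
    bits = (sign << 31) | (exponent << 23) | mantissa
    # Reinterpret the 32-bit pattern as a float.
    return struct.unpack(">f", struct.pack(">I", bits))[0]


print(single_from_fields(0, 0x7F, 0))  # exponent field 0x7f, mantissa field 0
print(single_from_fields(0, 0xFF, 0))  # all-ones exponent: an IEEE special value
```

The second call shows the kind of IEEE special value (infinity) that is awkward to write as a decimal constant.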


2.4.3 String Constants 


All characters except the newline character are allowed in string constants. 
String constants begin and end with double quotation marks ("). 


The assembler observes most of the backslash conventions used by the C 
language. Table 2-1 shows the assembler’s backslash conventions. 

Table 2-1: Backslash Conventions

Convention    Meaning

\a            Alert (0x07)
\b            Backspace (0x08)
\f            Form feed (0x0c)
\n            Newline (0x0a)
\r            Carriage return (0x0d)
\t            Horizontal tab (0x09)
\v            Vertical feed (0x0b)
\\            Backslash (0x5c)
\"            Quotation mark (0x22)
\'            Single quote (0x27)
\nnn          Character whose octal value is nnn (where n is 0-7)
\Xnn          Character whose hexadecimal value is nn
              (where n is 0-9, a-f, or A-F)


Deviations from C conventions are as follows: 

• The assembler does not recognize “\?”.

• The assembler does not recognize the prefix “L” (wide character constant).

• The assembler limits hexadecimal constants to two characters.

• The assembler allows the leading “x” character in hexadecimal
constants to be either uppercase or lowercase; that is, both \xnn and
\Xnn are allowed.


For octal notation, the backslash conventions require three characters when 
the next character could be confused with the octal number. 


For hexadecimal notation, the backslash conventions require two characters
when the next character could be confused with the hexadecimal number.
Insert a 0 (zero) as the first character of the single-character hexadecimal
number when this condition occurs.
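
The conventions in Table 2-1, including the three-digit octal limit and the two-digit hexadecimal limit, can be modeled with a short Python decoder. This is a sketch under assumptions: the name decode_string_constant is hypothetical, the input is the text between the quotation marks, and only ASCII input is handled.

```python
SIMPLE = {"a": 0x07, "b": 0x08, "f": 0x0C, "n": 0x0A, "r": 0x0D,
          "t": 0x09, "v": 0x0B, "\\": 0x5C, '"': 0x22, "'": 0x27}
OCTAL = "01234567"
HEX = "0123456789abcdefABCDEF"


def decode_string_constant(body):
    """Translate backslash conventions in a string constant into bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(ord(body[i]))
            i += 1
            continue
        i += 1
        if i >= len(body):
            raise ValueError("trailing backslash")
        esc = body[i]
        if esc in SIMPLE:
            out.append(SIMPLE[esc])
            i += 1
        elif esc in OCTAL:
            j = i
            while j < len(body) and j - i < 3 and body[j] in OCTAL:
                j += 1                      # at most three octal digits
            out.append(int(body[i:j], 8) & 0xFF)
            i = j
        elif esc in "xX":
            j = i + 1
            while j < len(body) and j - (i + 1) < 2 and body[j] in HEX:
                j += 1                      # at most two hexadecimal digits
            if j == i + 1:
                raise ValueError("no hexadecimal digits after \\x")
            out.append(int(body[i + 1:j], 16))
            i = j
        else:
            raise ValueError("unrecognized convention: \\" + esc)
    return bytes(out)


# A newline followed by the letter f needs the inserted zero:
print(decode_string_constant(r"\x0af"))
# Without the zero, the digits a and f form one character (0xaf):
print(decode_string_constant(r"\xaf"))
```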


2.5 Multiple Lines Per Physical Line 


You can include multiple statements on the same line by separating the 
statements with semicolons. Note, however, that the assembler does not 
recognize semicolons as separators when they follow comment symbols 
(# or /*). 

2.6 Statements


The assembler supports the following types of statements:

• Null statements

• Keyword statements

Each keyword statement can include an optional label, an operation code
(mnemonic or directive), and zero or more operands (with an optional
comment following the last operand on the statement):


[label]: opcode operand [; opcode operand; ...] [# comment]

Some keyword statements also support relocation operands (see
Section 2.6.4).

2.6.1 Labels 


Labels can consist of label definitions or numeric values: 


• A label definition consists of an identifier followed by a colon. (See
Section 2.3 for the rules governing identifiers.) Label definitions assign
the current value and type of the location counter to the name. An error
results when the name is already defined.


Label definitions always end with a colon. You can put a label definition
on a line by itself.


• A numeric label is a single numeric value (1-255). Unlike label
definitions, the value of a numeric label can be applied to any number
of statements in a program. To reference a numeric label, put an f
(forward) or a b (backward) immediately after the referencing digit in an
instruction, for example, br 7f (which is a forward branch to numeric
label 7). The reference directs the assembler to look for the nearest
numeric label that corresponds to the specified number in the lexically
forward or backward direction.
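
The nearest-label search can be illustrated with a small Python model. It is a sketch only: the function name resolve_numeric_label and the list representation are hypothetical, and the choice to let a backward reference match a label on the referencing statement itself is an assumption.

```python
def resolve_numeric_label(labels, ref_index, reference):
    """Return the index of the statement named by a reference such as '7f'.

    labels[i] is the numeric label attached to statement i, or None.
    ref_index is the index of the statement that contains the reference.
    """
    number, direction = int(reference[:-1]), reference[-1]
    if direction == "f":
        candidates = range(ref_index + 1, len(labels))   # lexically forward
    elif direction == "b":
        candidates = range(ref_index, -1, -1)            # lexically backward
    else:
        raise ValueError("reference must end in f or b")
    for i in candidates:
        if labels[i] == number:
            return i
    raise LookupError("no numeric label %d in that direction" % number)


# Label 7 is attached to statements 0 and 3; the reference sits in statement 1.
labels = [7, None, None, 7, None]
print(resolve_numeric_label(labels, 1, "7f"))
print(resolve_numeric_label(labels, 1, "7b"))
```

Because the same number can label many statements, the direction suffix is what makes each reference unambiguous.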


2.6.2 Null Statements 


A null statement is an empty statement that the assembler ignores. Null 
statements can have label definitions. For example, the following line has 
three null statements in it: 


label: ; ;


2.6.3 Keyword Statements


A keyword statement contains a predefined keyword. The syntax for the rest 
of the statement depends on the keyword. Keywords are either assembler 
instructions (mnemonics) or directives. 

Assembler instructions in the main instruction set and the floating-point
instruction set are described in Chapter 3 and Chapter 4, respectively. 
Assembler directives are described in Chapter 5. 


2.6.4 Relocation Operands 


Relocation operands are generally useful in only two situations: 


• In application programs in which the programmer needs precise control
over scheduling


• In source code written for compiler development


Some macro instructions (for example, ldgp) require special coordination
between the machine-code instructions and the relocation sequences given
to the linker. By using the macro instructions, the assembler programmer
relies on the assembler to generate the appropriate relocation sequences.


In some instances, the use of macro instructions may be undesirable. For
example, a compiler that supports the generation of assembly language
files may not want to defer instruction scheduling to the assembler. Such a
compiler will want to schedule some or all of the machine-code instructions.
To do this, the compiler must have a mechanism for emitting an object file’s
relocation sequences without using macro instructions. The mechanism for
establishing these sequences is the relocation operand.


A relocation operand can be placed after the normal operand on an assembly
language statement:


opcode operand relocation_operand

The relocation_operand has the following form:


!relocation_type!sequence_number

relocation_type


Any one of the following relocation types can be specified:


literal
lituse_base
lituse_bytoff
lituse_jsr
gpdisp
gprelhigh
gprellow
tlsliteral
tlshigh
tlslow

The relocation types must be enclosed within a pair of exclamation
points (!) and are not case-sensitive. See the Symbol Table/Object
File Specification manual for descriptions of the different types of
relocation operations.


sequence_number


The sequence number is a numeric constant with a value range of 1 to
2147483647. The constant can be base 8, 10, or 16. Bases other than
10 require a prefix (see Section 2.4.1).
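
The form of a relocation operand can be checked with a short Python sketch. It is a model of the rules stated above, not the assembler's parser; the function name parse_relocation_operand is hypothetical.

```python
RELOCATION_TYPES = {
    "literal", "lituse_base", "lituse_bytoff", "lituse_jsr", "gpdisp",
    "gprelhigh", "gprellow", "tlsliteral", "tlshigh", "tlslow",
}


def parse_relocation_operand(text):
    """Split '!type!sequence' into a (type, sequence_number) pair."""
    parts = text.split("!")
    if len(parts) != 3 or parts[0] != "":
        raise ValueError("expected !relocation_type!sequence_number")
    rel_type = parts[1].strip().lower()        # types are not case-sensitive
    if rel_type not in RELOCATION_TYPES:
        raise ValueError("unknown relocation type: " + parts[1])
    digits = parts[2].strip()
    if digits[:2].lower() == "0x":
        sequence = int(digits[2:], 16)         # base 16 needs the 0x prefix
    elif len(digits) > 1 and digits[0] == "0":
        sequence = int(digits[1:], 8)          # base 8 needs a leading zero
    else:
        sequence = int(digits, 10)
    if not 1 <= sequence <= 2147483647:
        raise ValueError("sequence number out of range")
    return rel_type, sequence


print(parse_relocation_operand("!literal!1"))
print(parse_relocation_operand("!LITUSE_BASE!0x10"))
```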


The following examples contain relocation operands in the source code: 


Example 1 - Referencing multiple lituse_base relocations:


# Equivalent C statement:
#   sym1 += sym2 (Both external)

# Assembly statements containing macro instructions:
        ldq     $1, sym1
        ldq     $2, sym2
        addq    $1, $2, $3
        stq     $3, sym1

# Assembly statements containing machine-code instructions
# requiring relocation operands:
        ldq     $1, sym1($gp)!literal!1
        ldq     $2, sym2($gp)!literal!2
        ldq     $3, sym1($1)!lituse_base!1
        ldq     $4, sym2($2)!lituse_base!2
        addq    $3, $4, $3
        stq     $3, sym1($1)!lituse_base!1


The assembler stores the sym1 and sym2 address constants in the .lita
section.


In this example, the code with relocation operands provides better
performance than the other code because it saves on register usage and
on the length of machine-code instruction sequences.


Example 2 - Referencing an ldgp sequence that is scheduled inside a
lituse_base relocation:


# Assembly statements containing macro instructions:
        beq     $2, L
        stq     $31, sym
        ldgp    $gp, 0($27)

# Assembly statements containing machine-code instructions that
# require relocation operands:
        ldq     $at, sym($gp)!literal!1
        beq     $2, L                   # crosses basic block boundary
        ldah    $gp, 0($27)!gpdisp!2
        stq     $31, sym($at)!lituse_base!1
        lda     $gp, 0($gp)!gpdisp!2

In this example, the programmer has elected to schedule the load of the
address of sym before the conditional branch.


Example 3 - A routine call:


# Assembly statements containing macro instructions:
        jsr     sym1
        ldgp    $gp, 0($ra)

        .extern sym1
        .text

# Assembly statements containing machine-code instructions that
# require relocation operands:
        ldq     $27, sym1($gp)!literal!1
        jsr     $26, ($27), sym1!lituse_jsr!1
                # as1 puts in an R_HINT for the jsr instruction
        ldah    $gp, 0($ra)!gpdisp!2
        lda     $gp, 0($gp)!gpdisp!2

In this example, the code with relocation operands does not provide any
significant gains over the other code. This example is only provided to
show the different coding methods.


2.7 Expressions 


An expression is a sequence of symbols that represents a value. Each 
expression and its result have data types. The assembler does arithmetic 
in two's complement integers with 64 bits of precision. Expressions follow 
precedence rules and consist of the following elements: 


• Operators
• Identifiers
• Constants


You can also use a single-character string in place of an integer within
an expression. For example, the following two pairs of statements are
equivalent:

.byte "a" ; .word "a"+0x19
.byte 0x61 ; .word 0x7a


2.7.1 Expression Operators 


The assembler supports the operators shown in Table 2-2. 

Table 2-2: Expression Operators

Operator    Meaning

+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Remainder
<<          Shift left
>>          Shift right (sign is not extended)
^           Bitwise EXCLUSIVE OR
&           Bitwise AND
|           Bitwise OR
-           Minus (unary)
+           Identity (unary)
~           Complement


2.7.2 Expression Operator Precedence Rules 


For the order of operator evaluation within expressions, you can rely on the
precedence rules or you can group expressions with parentheses. Unless
parentheses enforce precedence, the assembler evaluates all operators
of the same precedence strictly from left to right. Because parentheses
also designate index registers, ambiguity can arise from parentheses in
expressions. To resolve this ambiguity, put a unary + in front of parentheses
in expressions.


The assembler has three precedence levels. Table 2-3 lists the precedence
rules from lowest to highest.

Table 2-3: Operator Precedence

Precedence                          Operators

Least binding, lowest precedence    Binary +, -
                                    Binary *, /, %, <<, >>, ^, &, |
Most binding, highest precedence    Unary -, +, ~


Note

The assembler’s precedence scheme differs from that of the C language.
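
To make the difference from C concrete, the following Python sketch evaluates constant expressions with the three levels of Table 2-3, strictly left to right within a level, in 64-bit two's complement arithmetic. It is a hedged model rather than the assembler's evaluator: the names evaluate, apply, and to_signed are hypothetical, identifiers are not supported, and the rounding of / and % for negative operands is an assumption.

```python
import re

MASK = (1 << 64) - 1
HIGH_BINARY = ("*", "/", "%", "<<", ">>", "^", "&", "|")
TOKEN = re.compile(r"\s*(0[xX][0-9a-fA-F]+|\d+|<<|>>|[-+*/%^&|~()])")


def to_signed(value):
    """Wrap to 64 bits and reinterpret as a two's complement integer."""
    value &= MASK
    return value - (1 << 64) if value >> 63 else value


def number(token):
    if token[:2] in ("0x", "0X"):
        return int(token[2:], 16)
    if len(token) > 1 and token[0] == "0":
        return int(token[1:], 8)
    return int(token, 10)


def apply(op, a, b):
    if op == "*":
        result = a * b
    elif op == "/":            # assumption: quotient truncates toward zero
        result = abs(a) // abs(b)
        result = -result if (a < 0) != (b < 0) else result
    elif op == "%":            # assumption: remainder takes the dividend's sign
        result = abs(a) % abs(b)
        result = -result if a < 0 else result
    elif op == "<<":
        result = a << b
    elif op == ">>":           # sign is not extended: shift the unsigned bits
        result = (a & MASK) >> b
    elif op == "^":
        result = a ^ b
    elif op == "&":
        result = a & b
    else:
        result = a | b
    return to_signed(result)


def evaluate(text):
    tokens, pos, text = [], 0, text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError("unexpected character at offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    index = 0

    def peek():
        return tokens[index] if index < len(tokens) else None

    def take():
        nonlocal index
        if index >= len(tokens):
            raise ValueError("unexpected end of expression")
        index += 1
        return tokens[index - 1]

    def unary():               # highest precedence: unary -, +, ~
        token = take()
        if token == "-":
            return -unary()
        if token == "+":
            return unary()
        if token == "~":
            return ~unary()
        if token == "(":
            value = low()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        return number(token)

    def high():                # middle level, evaluated left to right
        value = unary()
        while peek() in HIGH_BINARY:
            value = apply(take(), value, unary())
        return value

    def low():                 # lowest precedence: binary + and -
        value = high()
        while peek() in ("+", "-"):
            op, rhs = take(), high()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = low()
    if index != len(tokens):
        raise ValueError("unexpected token: " + tokens[index])
    return to_signed(result)


print(evaluate("2 + 4 & 1"))    # & binds tighter than +; C groups it as (2 + 4) & 1
print(evaluate("1 << 2 * 3"))   # same level, left to right; C groups it as 1 << (2 * 3)
```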


2.7.3 Data Types 


Each symbol you reference or define in an assembly program belongs to one 
of the type categories shown in Table 2-4. 


Table 2-4: Data Types

Type
    Description

undefined
    Any symbol that is referenced but not defined becomes
    global undefined. (Declaring such a symbol in a .globl
    directive merely makes its status clearer.)

absolute
    A constant defined in an assignment (=) expression.

text
    Any symbol defined while the .text directive is in
    effect belongs to the text section. The text section
    contains the program's instructions, which are not
    modifiable during execution.

data
    Any symbol defined while the .data directive is in effect
    belongs to the data section. The data section contains
    memory that the linker can initialize to nonzero values
    before your program begins to execute.

sdata
    The type sdata is similar to the type data, except that defining
    a symbol while the .sdata (“small data”) directive is in effect
    causes the linker to place it within the small data section. This
    increases the chance that the linker will be able to optimize
    memory references to the item by using gp-relative addressing.

rdata and rconst
    Any symbol defined while the .rdata or .rconst directives
    are in effect belongs to this category. The only difference
    between the types rdata and rconst is that the former is
    allowed to have dynamic relocations and the latter is not.
    (The types rdata and rconst are also similar to the type data
    but, unlike data, cannot be modified during execution.)

bss and sbss
    Any symbol defined in a .comm or .lcomm directive belongs
    to these sections, except that a .data, .sdata, .rdata, or
    .rconst directive can override a .comm directive. The .bss
    and .sbss sections consist of memory that the kernel loader
    initializes to zero before your program begins to execute.

    If a symbol’s size is less than the number of bytes specified by
    the -G compilation option (which defaults to eight), it belongs
    to the .sbss section (small bss section), and the linker places it
    within the small data section. This increases the chance that
    the linker will be able to optimize memory references to the
    item by using gp-relative addressing.

    Local symbols in the .bss or .sbss sections defined by
    .lcomm directives are allocated memory by the assembler,
    global symbols are allocated memory by the linker, and
    symbols defined by .comm directives are overlaid upon
    like-named symbols (in the fashion of Fortran COMMON
    blocks) by the linker.


Symbols in the undefined category are always global; that is, they are
visible to the linker and can be shared with other modules of your program.
Symbols in the absolute, text, data, sdata, rdata, rconst, bss, and sbss type
categories are local unless declared in a .globl directive.


2.7.4 Type Propagation in Expressions 


For any expression, the result’s type depends on the types of the operands
and the operator. The following type propagation rules are used in
expressions:


• If an operand is undefined, the result is undefined.


• If both operands are absolute, the result is absolute.


• If the operator is a plus sign (+) and the first operand refers to an
undefined external symbol or a relocatable symbol in a .text section,
.data section, or .bss section, the result has the first operand’s type
and the other operand must be absolute.


• If the operator is a minus sign (-) and the first operand refers to a
relocatable symbol in a .text section, .data section, or .bss section,
the type propagation rules can vary:

- The second operand can be absolute (if it was previously defined) and
the result has the first operand’s type.


- The second operand can have the same type as the first operand
and the result is absolute.


- If the first operand is external undefined, the second operand must
be absolute.


• The operators *, /, %, <<, >>, ~, ^, &, and | apply only to absolute
symbols.
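
One way to read the rules above is as a small decision function. The Python sketch below is only one interpretation: the name result_type is hypothetical, only binary operators are modeled, and the order in which overlapping rules are applied is an assumption.

```python
RELOCATABLE = {"text", "data", "bss"}
ABSOLUTE_ONLY = {"*", "/", "%", "<<", ">>", "^", "&", "|"}


def result_type(op, first, second):
    """Return the type of 'first op second', or raise ValueError if not allowed."""
    if op in ABSOLUTE_ONLY:
        if first == second == "absolute":
            return "absolute"
        raise ValueError(op + " applies only to absolute operands")
    if op not in ("+", "-"):
        raise ValueError("unknown operator: " + op)
    if first == second == "absolute":
        return "absolute"
    if first == "undefined":
        # An external undefined first operand requires an absolute second operand.
        if second != "absolute":
            raise ValueError("second operand must be absolute")
        return "undefined"
    if first in RELOCATABLE:
        if second == "absolute":
            return first                  # result keeps the first operand's type
        if op == "-" and second == first:
            return "absolute"             # difference of two same-type symbols
        raise ValueError("operand types cannot be combined")
    if second == "undefined":
        return "undefined"                # any undefined operand: undefined result
    raise ValueError("combination not covered by the rules")


print(result_type("+", "text", "absolute"))
print(result_type("-", "data", "data"))
```

The second call reflects the common idiom of subtracting two labels in the same section to obtain an absolute size.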


2.8 Address Formats 


The assembler accepts addresses expressed in the formats described in 
Table 2-5. 


Table 2-5: Address Formats

Format
    Address Description

(base-register)
    Specifies an indexed address, which assumes a zero offset.
    The base register’s contents specify the address.

expression
    Specifies an absolute address. The assembler generates the
    most locally efficient code for referencing the value at the
    specified address.

expression(base-register)
    Specifies a based address. To get the address, the value of
    the expression is added to the contents of the base register.
    The assembler generates the most locally efficient code for
    referencing the value at the specified address.

relocatable-symbol
    Specifies a relocatable address. The assembler generates the
    necessary instructions to address the item and generates
    relocation information for the linker.

relocatable-symbol±expression
    Specifies a relocatable address. To get the address, the value
    of the expression, which has an absolute value, is added to or
    subtracted from the relocatable symbol. The assembler generates
    the necessary instructions to address the item and generates
    relocation information for the linker. If the symbol name does
    not appear as a label anywhere in the assembly, the assembler
    assumes that the symbol is external.

relocatable-symbol(index-register)
    Specifies an indexed relocatable address. To get the address,
    the index register is added to the relocatable symbol’s address.
    The assembler generates the necessary instructions to address
    the item and generates relocation information for the linker.
    If the symbol name does not appear as a label anywhere in the
    assembly, the assembler assumes that the symbol is external.

relocatable-symbol±expression(index-register)
    Specifies an indexed relocatable address. To get the address,
    the assembler adds or subtracts the relocatable symbol, the
    expression, and the contents of the index register. The
    assembler generates the necessary instructions to address the
    item and generates relocation information for the linker. If
    the symbol name does not appear as a label anywhere in the
    assembly, the assembler assumes that the symbol is external.

3 Main Instruction Set


The assembler’s instruction set consists of a main instruction set and a 
floating-point instruction set. This chapter describes the main instruction 
set; Chapter 4 describes the floating-point instruction set. For details on the 
instruction set beyond the scope of this manual, see the Alpha Architecture 
Reference Manual. 


The assembler’s main instruction set contains the following classes of 
instructions: 


• Load and store instructions (Section 3.1)

• Arithmetic instructions (Section 3.2)

• Logical and shift instructions (Section 3.3)

• Relational instructions (Section 3.4)

• Move instructions (Section 3.5)

• Control instructions (Section 3.6)

• Byte-manipulation instructions (Section 3.7)

• Special-purpose instructions (Section 3.8)


Tables in this chapter show the format of each instruction in the main 
instruction set. The tables list the instruction names and the forms of 
operands that can be used with each instruction. The specifiers used in the 
tables to identify operands have the following meanings: 


Operand Specifier       Description

address                 A symbolic expression whose effective value is
                        used as an address.

b_reg                   Base register. An integer register containing a base
                        address to which is added an offset (or displacement)
                        value to produce an effective address.

d_reg                   Destination register. An integer register that receives
                        a value as a result of an operation.

d_reg/s_reg             One integer register that is used as both a destination
                        register and a source register.

label                   A label that identifies a location in a program.

no_operands             No operands are specified.

offset                  An immediate value that is added to the contents of a
                        base register to calculate an effective address.

palcode                 A value that determines the operation performed
                        by a PALcode instruction.

s_reg, s_reg1,          Source registers whose contents are to be used
s_reg2                  in an operation.

val_expr                An expression whose value is used as an absolute value.

val_immed               An immediate value that is to be used in an operation.

jhint                   An address operand that provides a hint of where a jmp
                        or jsr instruction will transfer control.

rhint                   An immediate operand that provides software with a hint
                        about how a ret or jsr_coroutine instruction is used.


3.1 Load and Store Instructions 


Load and store instructions load immediate values and move data between 
memory and general registers. This section describes the general-purpose 
load and store instructions supported by the assembler. 


Table 3-1 lists the mnemonics and operands for instructions that perform 
load and store operations. The table is divided into groups of instructions. 
The operands specified within a particular group apply to all of the 
instructions contained in that group. 


Table 3-1: Load and Store Formats

Instruction                           Mnemonic     Operands

Load Address                          lda (a)      d_reg, address
Load Byte                             ldb
Load Byte Unsigned                    ldbu
Load Word                             ldw
Load Word Unsigned                    ldwu
Load Sign Extended Longword           ldl (a)
Load Sign Extended Longword Locked    ldl_l (a)
Load Quadword                         ldq (a)
Load Quadword Locked                  ldq_l (a)
Load Quadword Unaligned               ldq_u (a)
Unaligned Load Word                   uldw
Unaligned Load Word Unsigned          uldwu
Unaligned Load Longword               uldl
Unaligned Load Quadword               uldq

Load Address High                     ldah (a)     d_reg, offset(b_reg)
Load Global Pointer                   ldgp

Load Immediate Longword               ldil         d_reg, val_expr
Load Immediate Quadword               ldiq

Store Byte                            stb          s_reg, address
Store Word                            stw
Store Longword                        stl (a)
Store Longword Conditional            stl_c (a)
Store Quadword                        stq (a)
Store Quadword Conditional            stq_c (a)
Store Quadword Unaligned              stq_u (a)
Unaligned Store Word                  ustw
Unaligned Store Longword              ustl
Unaligned Store Quadword              ustq

(a) In addition to the normal operands that can be specified with this
instruction, relocation operands can also be specified (see Section 2.6.4).


Section 3.1.1 describes the operations performed by load instructions and 
Section 3.1.2 describes the operations performed by store instructions. 


3.1.1 Load Instruction Descriptions 


Load instructions move values (addresses, values of expressions, or contents 
of memory locations) into registers. For all load instructions, the effective 
address is the 64-bit two’s-complement sum of the contents of the index 
register and the sign-extended offset. 


Instructions whose address operands contain symbolic labels imply an index
register, which the assembler determines. Some assembler load instructions
can produce multiple machine-code instructions (see Section C.4).


Note


Load instructions can generate many code sequences for which
the linker must fix the address by resolving external data items.


Table 3-2 describes the operations performed by load instructions. 


Table 3-2: Load Instruction Descriptions

Instruction
    Description

Load Address (lda)
    Loads the destination register with the effective address
    of the specified data item.

Load Byte (ldb)
    Loads the least significant byte of the destination register with the
    contents of the byte specified by the effective address. Because
    the loaded byte is a signed value, its sign bit is replicated to fill
    the other bytes in the destination register. (The assembler uses
    temporary registers AT and t9 for this instruction.)

Load Byte Unsigned (ldbu)
    Loads the least significant byte of the destination register with the
    contents of the byte specified by the effective address. Because
    the loaded byte is an unsigned value, the other bytes of the
    destination register are cleared to zeros. (The assembler uses
    temporary registers AT and t9 for this instruction, unless the
    setting of the .arch directive or the -arch flag on the cc or as
    command line causes the assembler to generate a single machine
    instruction in response to the ldbu instruction.)

Load Word (ldw)
    Loads the two least significant bytes of the destination register with
    the contents of the word specified by the effective address. Because
    the loaded word is a signed value, its sign bit is replicated to fill the
    other bytes in the destination register.

    If the effective address is not evenly divisible by two, a data-alignment
    exception may be signaled. (The assembler uses temporary registers
    AT and t9 for this instruction.)

Load Word Unsigned (ldwu)
    Loads the two least significant bytes of the destination register with
    the contents of the word specified by the effective address. Because the
    loaded word is an unsigned value, the other bytes of the destination
    register are cleared to zeros.

    If the effective address is not evenly divisible by two, a data-alignment
    exception may be signaled. (The assembler uses temporary registers
    AT and t9 for this instruction, unless the setting of the .arch
    directive or the -arch flag on the cc or as command line causes the
    assembler to generate a single machine instruction in response to the
    ldwu instruction.)

Load Sign Extended Longword (ldl)
    Loads the four least significant bytes of the destination register
    with the contents of the longword specified by the effective address.
    Because the loaded longword is a signed value, its sign bit is replicated
    to fill the other bytes in the destination register.

    If the effective address is not evenly divisible by four, a data-alignment
    exception is signaled.


Load Sign Extended Longword Locked (ldl_l)
    Loads the four least significant bytes of the destination register
    with the contents of the longword specified by the effective address.
    Because the loaded longword is a signed value, its sign bit is replicated
    to fill the other bytes in the destination register.

    If the effective address is not evenly divisible by four, a data-alignment
    exception is signaled.

    If an ldl_l instruction executes without generating an exception,
    the processor records the target physical address in a per-processor
    locked-physical-address register and sets the per-processor lock flag.

    If the per-processor lock flag is still set when a stl_c instruction is
    executed, the store occurs; otherwise, it does not occur.

Load Quadword (ldq)
    Loads the destination register with the contents of the quadword
    specified by the effective address. All bytes of the register are replaced
    with the contents of the loaded quadword.

    If the effective address is not evenly divisible by eight, a
    data-alignment exception is signaled.

    If a literal relocation type is specified in the ldq instruction, one
    machine instruction is generated and the symbol and offset are stored
    in the .lita section. Other relocation types generate a sequence of
    instructions and the symbol and offset are stored in that sequence.

Load Quadword Locked (ldq_l)
    Loads the destination register with the contents of the quadword
    specified by the effective address. All bytes of the register are replaced
    with the contents of the loaded quadword.

    If the effective address is not evenly divisible by eight, a
    data-alignment exception is signaled.

    If an ldq_l instruction executes without generating an exception,
    the processor records the target physical address in a per-processor
    locked-physical-address register and sets the per-processor lock flag.

    If the per-processor lock flag is still set when a stq_c instruction is
    executed, the store occurs; otherwise, it does not occur.

Load Quadword Unaligned (ldq_u)
    Loads the destination register with the contents of the quadword
    specified by the effective address (with the three low-order
    bits cleared). The address does not have to be aligned on an
    8-byte boundary; it can be any byte address.

Unaligned Load Word (uldw)
    Loads the two least significant bytes of the destination register with
    the word at the specified address. The address does not have to be
    aligned on a 2-byte boundary; it can be any byte address. Because
    the loaded word is a signed value, its sign bit is replicated to fill
    the other bytes in the destination register. (The assembler uses
    temporary registers AT, t9, and t10 for this instruction.)

Unaligned Load Word Unsigned (uldwu)
    Loads the two least significant bytes of the destination register
    with the word at the specified address. The address does not have
    to be aligned on a 2-byte boundary; it can be any byte address.
    Because the loaded word is an unsigned value, the other bytes of
    the destination register are cleared to zeros. (The assembler uses
    temporary registers AT, t9, and t10 for this instruction.)

Unaligned Load Longword (uldl)
    Loads the four least significant bytes of the destination register
    with the longword at the specified address. The address does
    not have to be aligned on a 4-byte boundary; it can be any byte
    address in memory. (The assembler uses temporary registers
    AT, t9, and t10 for this instruction.)

Unaligned Load Quadword (uldq)
    Loads the destination register with the quadword at the specified
    address. The address does not have to be aligned on an 8-byte
    boundary; it can be any byte address in memory. (The assembler uses
    temporary registers AT, t9, and t10 for this instruction.)

Load Address High (ldah)
    Loads the destination register with the effective address of the
    specified data item. In computing the effective address, the signed
    constant offset is multiplied by 65536 before adding to the base
    register. The signed constant must be in the range -32768 to 32767.

Load Global Pointer (ldgp)
    Loads the destination register with the global pointer value for the
    procedure. The sum of the base register and the sign-extended
    offset specifies the address of the ldgp instruction.

Load Immediate Longword (ldil)
    Loads the destination register with the value of an expression that can
    be computed at assembly time. The value is converted to canonical
    longword form before being stored in the destination register; bit 31
    is replicated in bits 32 through 63 of the destination register. (See
    Appendix B for additional information on canonical forms.)

Load Immediate Quadword (ldiq)
    Loads the destination register with the value of an expression
    that can be computed at assembly time.
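
The arithmetic behind several of the load descriptions in Table 3-2 can be written down as a Python sketch. It models only the register values described in the table, not memory access or the multi-instruction sequences the assembler may generate, and all function names are hypothetical. Registers are represented as unsigned 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Replicate the sign bit of a bits-wide value through bit 63 (ldb, ldw, ldl)."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value & MASK64


def zero_extend(value, bits):
    """Clear every bit above a bits-wide value (ldbu, ldwu)."""
    return value & ((1 << bits) - 1)


def effective_address(base, offset):
    """64-bit two's complement sum of a register and a sign-extended 16-bit offset."""
    return (base + sign_extend(offset, 16)) & MASK64


def ldah_value(base, offset):
    """ldah: the signed offset is multiplied by 65536 before the addition."""
    if not -32768 <= offset <= 32767:
        raise ValueError("offset must be in the range -32768 to 32767")
    return (base + offset * 65536) & MASK64


def ldil_value(value):
    """ldil: canonical longword form, bit 31 replicated in bits 32 through 63."""
    return sign_extend(value, 32)


def ldq_u_address(address):
    """ldq_u: the three low-order bits of the effective address are cleared."""
    return address & ~0x7 & MASK64


print(hex(sign_extend(0x80, 8)))       # a loaded byte with its sign bit set
print(hex(zero_extend(0x80, 8)))       # the same byte loaded as unsigned
print(hex(ldah_value(0, 1)))           # offset 1 scaled by 65536
print(hex(ldil_value(0x80000000)))     # bit 31 copied upward
print(hex(ldq_u_address(0x1007)))      # address rounded down to a quadword
```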


3.1.2 Store Instruction Descriptions 


For all store instructions, the effective address is the 64-bit two’s-complement
sum of the contents of the index register and the sign-extended 16-bit offset.


Instructions whose address operands contain symbolic labels imply an index
register, which the assembler determines. Some assembler store instructions
can produce multiple machine-code instructions (see Section C.4).


Table 3-3 describes the operations performed by store instructions. 


Table 3-3: Store Instruction Descriptions

Instruction
    Description

Store Byte (stb)
    Stores the least significant byte of the source register in the
    memory location specified by the effective address. (The assembler
    uses temporary registers AT, t9, and t10 for this instruction,
    unless the setting of the .arch directive or the -arch flag on the
    cc or as command line causes the assembler to generate a single
    machine instruction in response to the stb instruction.)

Store Word (stw)
    Stores the two least significant bytes of the source register in
    the memory location specified by the effective address.

    If the effective address is not evenly divisible by two, a
    data-alignment exception may be signaled. (The assembler uses
    temporary registers AT, t9, and t10 for this instruction, unless
    the setting of the .arch directive or the -arch flag on the cc or
    as command line causes the assembler to generate a single machine
    instruction in response to the stw instruction.)

Store Longword (stl)
    Stores the four least significant bytes of the source register in
    the memory location specified by the effective address.

    If the effective address is not evenly divisible by four, a
    data-alignment exception is signaled.

Store Longword Conditional (stl_c)
    Stores the four least significant bytes of the source register in
    the memory location specified by the effective address, if the lock
    flag is set. The lock flag is returned in the source register and
    is then set to zero.

    If the effective address is not evenly divisible by four, a
    data-alignment exception is signaled.

Store Quadword (stq)
    Stores the contents of the source register in the memory location
    specified by the effective address.

    If the effective address is not evenly divisible by eight, a
    data-alignment exception is signaled.

Store Quadword Conditional (stq_c)
    Stores the contents of the source register in the memory location
    specified by the effective address, if the lock flag is set. The
    lock flag is returned in the source register and is then set to
    zero.

    If the effective address is not evenly divisible by eight, a
    data-alignment exception is signaled.

Store Quadword Unaligned (stq_u)
    Stores the contents of the source register in the memory location
    specified by the effective address (with the three low-order bits
    cleared).

Unaligned Store Word (ustw)
    Stores the two least significant bytes of the source register in
    the memory location specified by the effective address. The address
    does not have to be aligned on a 2-byte boundary; it can be any
    byte address. (The assembler uses temporary registers AT, t9, t10,
    t11, and t12 for this instruction.)

Unaligned Store Longword (ustl)
    Stores the four least significant bytes of the source register in
    the memory location specified by the effective address. The address
    does not have to be aligned on a 4-byte boundary; it can be any
    byte address. (The assembler uses temporary registers AT, t9, t10,
    t11, and t12 for this instruction.)

Unaligned Store Quadword (ustq)
    Stores the contents of the source register in a memory location
    specified by the effective address. The address does not have to be
    aligned on an 8-byte boundary; it can be any byte address. (The
    assembler uses temporary registers AT, t9, t10, t11, and t12 for
    this instruction.)
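The alignment rules and the lock-flag behavior described in Table 3-3 can be modeled in a few lines. The following Python sketch is not Alpha code and is not part of the compiler system. The Memory class, its method names, the little-endian byte order, and the use of a Python exception in place of the data-alignment exception are all assumptions made for illustration.

```python
class AlignmentError(Exception):
    """Stands in for the data-alignment exception (an assumption of this model)."""


class Memory:
    """Toy byte-addressed memory with a single lock flag (hypothetical model)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.lock_flag = False  # would be set by a load-locked instruction

    def _store(self, addr, value, nbytes):
        # Keep only the low-order nbytes of the 64-bit source register.
        value &= (1 << (8 * nbytes)) - 1
        # Little-endian layout is assumed here.
        self.data[addr:addr + nbytes] = value.to_bytes(nbytes, "little")

    def stl(self, addr, value):
        if addr % 4:
            raise AlignmentError(hex(addr))
        self._store(addr, value, 4)

    def stq(self, addr, value):
        if addr % 8:
            raise AlignmentError(hex(addr))
        self._store(addr, value, 8)

    def stq_u(self, addr, value):
        # The three low-order address bits are cleared before the store.
        self._store(addr & ~7, value, 8)

    def stq_c(self, addr, value):
        """Return the value left in the source register: 1 if stored, else 0."""
        if addr % 8:
            raise AlignmentError(hex(addr))
        stored = self.lock_flag
        if stored:
            self._store(addr, value, 8)
        self.lock_flag = False  # the lock flag is set to zero afterward
        return 1 if stored else 0
```

In this model, a second stq_c to the same address returns 0 and leaves memory unchanged, because the first stq_c cleared the lock flag.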

3.2 Arithmetic Instructions


Arithmetic instructions perform arithmetic operations on values in registers. 
(Floating-point arithmetic instructions are described in Section 4.3.) 


Table 3-4 lists the mnemonics and operands for instructions that perform 
arithmetic operations. The table is divided into groups of instructions. The 
operands specified within a particular group apply to all of the instructions 
contained in that group. 


Table 3-4: Arithmetic Instruction Formats

Instruction                             Mnemonic  Operands

Clear                                   clr       d_reg

Absolute Value Longword                 absl      s_reg, d_reg or
Absolute Value Quadword                 absq      d_reg/s_reg or
Negate Longword (without overflow)      negl      val_immed, d_reg
Negate Longword (with overflow)         neglv
Negate Quadword (without overflow)      negq
Negate Quadword (with overflow)         negqv
Sign-Extension Byte                     sextb
Sign-Extension Longword                 sextl
Sign-Extension Word                     sextw

Add Longword (without overflow)         addl      s_reg1, s_reg2, d_reg or
Add Longword (with overflow)            addlv     d_reg/s_reg1, s_reg2 or
Add Quadword (without overflow)         addq      s_reg1, val_immed, d_reg or
Add Quadword (with overflow)            addqv     d_reg/s_reg1, val_immed
Scaled Longword Add by 4                s4addl
Scaled Quadword Add by 4                s4addq
Scaled Longword Add by 8                s8addl
Scaled Quadword Add by 8                s8addq
Multiply Longword (without overflow)    mull
Multiply Longword (with overflow)       mullv
Multiply Quadword (without overflow)    mulq
Multiply Quadword (with overflow)       mulqv
Subtract Longword (without overflow)    subl
Subtract Longword (with overflow)       sublv
Subtract Quadword (without overflow)    subq
Subtract Quadword (with overflow)       subqv
Scaled Longword Subtract by 4           s4subl
Scaled Quadword Subtract by 4           s4subq
Scaled Longword Subtract by 8           s8subl
Scaled Quadword Subtract by 8           s8subq
Unsigned Quadword Multiply High         umulh
Divide Longword                         divl
Divide Longword Unsigned                divlu
Divide Quadword                         divq
Divide Quadword Unsigned                divqu
Longword Remainder                      reml
Longword Remainder Unsigned             remlu
Quadword Remainder                      remq
Quadword Remainder Unsigned             remqu

Table 3-5 describes the operations performed by arithmetic instructions.

Table 3-5: Arithmetic Instruction Descriptions

Instruction
    Description

Clear (clr)
    Sets the contents of the destination register to zero.

Absolute Value Longword (absl)
    Computes the absolute value of the contents of the source register
    and places the result in the destination register. If the value in
    the source register is -2147483648, an overflow exception is
    signaled.

Absolute Value Quadword (absq)
    Computes the absolute value of the contents of the source register
    and places the result in the destination register. If the value in
    the source register is -9223372036854775808, an overflow exception
    is signaled.

Negate Longword (without overflow) (negl)
    Negates the integer contents of the four least significant bytes in
    the source register and places the result in the destination
    register. An overflow occurs if the value in the source register is
    -2147483648, but the overflow exception is not signaled.

Negate Longword (with overflow) (neglv)
    Negates the integer contents of the four least significant bytes in
    the source register and places the result in the destination
    register. If the value in the source register is -2147483648, an
    overflow exception is signaled.

Negate Quadword (without overflow) (negq)
    Negates the integer contents of the source register and places the
    result in the destination register. An overflow occurs if the value
    in the source register is -9223372036854775808, but the overflow
    exception is not signaled.

Negate Quadword (with overflow) (negqv)
    Negates the integer contents of the source register and places the
    result in the destination register. An overflow exception is
    signaled if the value in the source register is
    -9223372036854775808.

Sign-Extension Byte (sextb)
    Moves the least significant byte of the source register into the
    least significant byte of the destination register. Because the
    moved byte is a signed value, its sign bit is replicated to fill
    the other bytes in the destination register.

Sign-Extension Word (sextw)
    Moves the two least significant bytes of the source register into
    the two least significant bytes of the destination register.
    Because the moved word is a signed value, its sign bit is
    replicated to fill the other bytes in the destination register.

Sign-Extension Longword (sextl)
    Moves the four least significant bytes of the source register into
    the four least significant bytes of the destination register.
    Because the moved longword is a signed value, its sign bit is
    replicated to fill the other bytes in the destination register.

Add Longword (without overflow) (addl)
    Computes the sum of two signed 32-bit values. This instruction adds
    the contents of s_reg1 to the contents of s_reg2 or the immediate
    value and then places the result in the destination register.
    Overflow exceptions never occur.

Add Longword (with overflow) (addlv)
    Computes the sum of two signed 32-bit values. This instruction adds
    the contents of s_reg1 to the contents of s_reg2 or the immediate
    value and then places the result in the destination register. If
    the result cannot be represented as a signed 32-bit number, an
    overflow exception is signaled.

Add Quadword (without overflow) (addq)
    Computes the sum of two signed 64-bit values. This instruction adds
    the contents of s_reg1 to the contents of s_reg2 or the immediate
    value and then places the result in the destination register.
    Overflow exceptions never occur.


Add Quadword (with overflow) (addqv)
    Computes the sum of two signed 64-bit values. This instruction adds
    the contents of s_reg1 to the contents of s_reg2 or the immediate
    value and then places the result in the destination register. If
    the result cannot be represented as a signed 64-bit number, an
    overflow exception is signaled.

Scaled Longword Add by 4 (s4addl)
    Computes the sum of two signed 32-bit values. This instruction
    scales (multiplies) the contents of s_reg1 by four and then adds
    the contents of s_reg2 or the immediate value. The result is stored
    in the destination register. Overflow exceptions never occur.

Scaled Quadword Add by 4 (s4addq)
    Computes the sum of two signed 64-bit values. This instruction
    scales (multiplies) the contents of s_reg1 by four and then adds
    the contents of s_reg2 or the immediate value. The result is stored
    in the destination register. Overflow exceptions never occur.

Scaled Longword Add by 8 (s8addl)
    Computes the sum of two signed 32-bit values. This instruction
    scales (multiplies) the contents of s_reg1 by eight and then adds
    the contents of s_reg2 or the immediate value. The result is stored
    in the destination register. Overflow exceptions never occur.

Scaled Quadword Add by 8 (s8addq)
    Computes the sum of two signed 64-bit values. This instruction
    scales (multiplies) the contents of s_reg1 by eight and then adds
    the contents of s_reg2 or the immediate value. The result is stored
    in the destination register. Overflow exceptions never occur.

Multiply Longword (without overflow) (mull)
    Computes the product of two signed 32-bit values. This instruction
    places the 32-bit product of s_reg1 and either s_reg2 or the
    immediate value in the destination register. Overflows are not
    reported.

Multiply Longword (with overflow) (mullv)
    Computes the product of two signed 32-bit values. This instruction
    places the 32-bit product of s_reg1 and either s_reg2 or the
    immediate value in the destination register. If an overflow occurs,
    an overflow exception is signaled.

Multiply Quadword (without overflow) (mulq)
    Computes the product of two signed 64-bit values. This instruction
    places the 64-bit product of s_reg1 and either s_reg2 or the
    immediate value in the destination register. Overflows are not
    reported.

Multiply Quadword (with overflow) (mulqv)
    Computes the product of two signed 64-bit values. This instruction
    places the 64-bit product of s_reg1 and either s_reg2 or the
    immediate value in the destination register. If an overflow occurs,
    an overflow exception is signaled.

Subtract Longword (without overflow) (subl)
    Computes the difference of two signed 32-bit values. This
    instruction subtracts either the contents of s_reg2 or an immediate
    value from the contents of s_reg1 and then places the result in the
    destination register. Overflow exceptions never occur.

Subtract Longword (with overflow) (sublv)
    Computes the difference of two signed 32-bit values. This
    instruction subtracts either the contents of s_reg2 or an immediate
    value from the contents of s_reg1 and then places the result in the
    destination register. If the true result's sign differs from the
    destination register's sign, an overflow exception is signaled.

Subtract Quadword (without overflow) (subq)
    Computes the difference of two signed 64-bit values. This
    instruction subtracts the contents of s_reg2 or an immediate value
    from the contents of s_reg1 and then places the result in the
    destination register. Overflow exceptions never occur.

Subtract Quadword (with overflow) (subqv)
    Computes the difference of two signed 64-bit values. This
    instruction subtracts the contents of s_reg2 or an immediate value
    from the contents of s_reg1 and then places the result in the
    destination register. If the true result's sign differs from the
    destination register's sign, an overflow exception is signaled.

Scaled Longword Subtract by 4 (s4subl)
    Computes the difference of two signed 32-bit values. This
    instruction subtracts the contents of s_reg2 or the immediate value
    from the scaled (by 4) contents of s_reg1. The result is stored in
    the destination register. Overflow exceptions never occur.

Scaled Quadword Subtract by 4 (s4subq)
    Computes the difference of two signed 64-bit values. This
    instruction subtracts the contents of s_reg2 or the immediate value
    from the scaled (by 4) contents of s_reg1. The result is stored in
    the destination register. Overflow exceptions never occur.

Scaled Longword Subtract by 8 (s8subl)
    Computes the difference of two signed 32-bit values. This
    instruction subtracts the contents of s_reg2 or the immediate value
    from the scaled (by 8) contents of s_reg1. The result is stored in
    the destination register. Overflow exceptions never occur.

Scaled Quadword Subtract by 8 (s8subq)
    Computes the difference of two signed 64-bit values. This
    instruction subtracts the contents of s_reg2 or the immediate value
    from the scaled (by 8) contents of s_reg1. The result is stored in
    the destination register. Overflow exceptions never occur.

Unsigned Quadword Multiply High (umulh)
    Computes the product of two unsigned 64-bit values. This
    instruction multiplies the contents of s_reg1 by the contents of
    s_reg2 or the immediate value and then places the high-order 64
    bits of the 128-bit product in the destination register.

Divide Longword (divl)
    Computes the quotient of two signed 32-bit values. This instruction
    divides the contents of s_reg1 by the contents of s_reg2 or the
    immediate value and then places the quotient in the destination
    register.

    The divl instruction rounds toward zero. If the divisor is zero, an
    error is signaled. Overflow is signaled when dividing -2147483648
    by -1. A call_pal PAL_gentrap instruction may be issued for
    divide-by-zero and overflow exceptions.

Divide Longword Unsigned (divlu)
    Computes the quotient of two unsigned 32-bit values. This
    instruction divides the contents of s_reg1 by the contents of
    s_reg2 or the immediate value and then places the quotient in the
    destination register.

    If the divisor is zero, an exception is signaled and a call_pal
    PAL_gentrap instruction may be issued. Overflow exceptions never
    occur. (The assembler uses temporary registers AT, t9, t10, t11,
    and t12 for the divlu instruction.)

Divide Quadword (divq)
    Computes the quotient of two signed 64-bit values. This instruction
    divides the contents of s_reg1 by the contents of s_reg2 or the
    immediate value and then places the quotient in the destination
    register.

    The divq instruction rounds toward zero. If the divisor is zero, an
    error is signaled. Overflow is signaled when dividing
    -9223372036854775808 by -1. A call_pal PAL_gentrap instruction may
    be issued for divide-by-zero and overflow exceptions. (The
    assembler uses temporary registers AT, t9, t10, t11, and t12 for
    the divq instruction.)

Divide Quadword Unsigned (divqu)
    Computes the quotient of two unsigned 64-bit values. This
    instruction divides the contents of s_reg1 by the contents of
    s_reg2 or the immediate value and then places the quotient in the
    destination register.

    If the divisor is zero, an exception is signaled and a call_pal
    PAL_gentrap instruction may be issued. Overflow exceptions never
    occur. (The assembler uses temporary registers AT, t9, t10, t11,
    and t12 for the divqu instruction.)

Longword Remainder (reml)
    Computes the remainder of the division of two signed 32-bit values.
    The remainder reml(i,j) is defined as i-(j*divl(i,j)), where j!=0.
    This instruction divides the contents of s_reg1 by the contents of
    s_reg2 or by the immediate value and then places the remainder in
    the destination register.

    The reml instruction rounds toward zero; for example,
    divl(5,-3)=-1 and reml(5,-3)=2.

    For divide-by-zero, an error is signaled and a call_pal PAL_gentrap
    instruction may be issued. (The assembler uses temporary registers
    AT, t9, t10, t11, and t12 for the reml instruction.)

Longword Remainder Unsigned (remlu)
    Computes the remainder of the division of two unsigned 32-bit
    values. The remainder remlu(i,j) is defined as i-(j*divlu(i,j)),
    where j!=0. This instruction divides the contents of s_reg1 by the
    contents of s_reg2 or the immediate value and then places the
    remainder in the destination register.

    For divide-by-zero, an error is signaled and a call_pal PAL_gentrap
    instruction may be issued. (The assembler uses temporary registers
    AT, t9, t10, t11, and t12 for the remlu instruction.)

Quadword Remainder (remq)
    Computes the remainder of the division of two signed 64-bit values.
    The remainder remq(i,j) is defined as i-(j*divq(i,j)), where j!=0.
    This instruction divides the contents of s_reg1 by the contents of
    s_reg2 or the immediate value and then places the remainder in the
    destination register.

    The remq instruction rounds toward zero; for example,
    divq(5,-3)=-1 and remq(5,-3)=2.

    For divide-by-zero, an error is signaled and a call_pal PAL_gentrap
    instruction may be issued. (The assembler uses temporary registers
    AT, t9, t10, t11, and t12 for the remq instruction.)

Quadword Remainder Unsigned (remqu)
    Computes the remainder of the division of two unsigned 64-bit
    values. The remainder remqu(i,j) is defined as i-(j*divqu(i,j)),
    where j!=0. This instruction divides the contents of s_reg1 by the
    contents of s_reg2 or the immediate value and then places the
    remainder in the destination register.

    For divide-by-zero, an error is signaled and a call_pal PAL_gentrap
    instruction may be issued. (The assembler uses temporary registers
    AT, t9, t10, t11, and t12 for the remqu instruction.)
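The distinctions drawn in Table 3-5 (wrapping versus signaling on overflow, scaling, the high half of a product, and division that rounds toward zero) can be checked with a small model. The Python sketch below is illustrative only and is not Alpha code. The helper names simply mirror the mnemonics, registers are modeled as Python integers, and OverflowError and ZeroDivisionError stand in for the exceptions the table mentions; these are assumptions of the sketch.

```python
def to_signed(value, bits):
    """Wrap an integer to a signed two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def addl(a, b):
    # Without overflow checking: the 32-bit sum silently wraps.
    return to_signed(a + b, 32)


def addlv(a, b):
    # With overflow checking: signal if the sum is not a signed 32-bit number.
    total = to_signed(a, 32) + to_signed(b, 32)
    if total != to_signed(total, 32):
        raise OverflowError("addlv")
    return total


def s8addq(a, b):
    # Scale the first operand by eight, then add the second (64-bit wrap).
    return to_signed(8 * a + b, 64)


def umulh(a, b):
    # High-order 64 bits of the unsigned 128-bit product.
    mask = (1 << 64) - 1
    return ((a & mask) * (b & mask)) >> 64


def divl(i, j):
    # Signed 32-bit quotient, rounded toward zero.
    if j == 0:
        raise ZeroDivisionError("divl")
    if i == -2**31 and j == -1:
        raise OverflowError("divl")
    q = abs(i) // abs(j)
    return q if (i < 0) == (j < 0) else -q


def reml(i, j):
    # Defined in the table as i - (j * divl(i, j)), where j != 0.
    return i - j * divl(i, j)
```

With these definitions, divl(5, -3) and reml(5, -3) reproduce the values -1 and 2 given in the table, and s8addq(index, base) shows why the scaled forms suit indexing an array of quadwords.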


3.3 Logical and Shift Instructions 


Logical and shift instructions perform logical operations and shifts on values 
in registers. 


Table 3-6 lists the mnemonics and operands for instructions that perform 
logical and shift operations. The table is divided into groups of instructions. 
The operands specified within a particular group apply to all of the 
instructions contained in that group. 


Table 3-6: Logical and Shift Instruction Formats

Instruction                               Mnemonic  Operands

Logical Complement - NOT                  not       s_reg, d_reg or
                                                    d_reg/s_reg or
                                                    val_immed, d_reg

Logical Product - AND                     and       s_reg1, s_reg2, d_reg or
Logical Sum - OR                          bis       d_reg/s_reg1, s_reg2 or
Logical Sum - OR                          or        s_reg1, val_immed, d_reg or
Logical Difference - XOR                  xor       d_reg/s_reg1, val_immed
Logical Product with Complement - ANDNOT  bic
Logical Product with Complement - ANDNOT  andnot
Logical Sum with Complement - ORNOT       ornot
Logical Equivalence - XORNOT              eqv
Logical Equivalence - XORNOT              xornot
Shift Left Logical                        sll
Shift Right Logical                       srl
Shift Right Arithmetic                    sra


Table 3-7 describes the operations performed by logical and shift
instructions.


Table 3-7: Logical and Shift Instruction Descriptions

Instruction
    Description

Logical Complement - NOT (not)
    Computes the logical NOT of a value. This instruction performs a
    complement operation on the contents of s_reg1 and places the
    result in the destination register.

Logical Product - AND (and)
    Computes the logical AND of two values. This instruction performs
    an AND operation between the contents of s_reg1 and either the
    contents of s_reg2 or the immediate value and then places the
    result in the destination register.

Logical Sum - OR (bis)
    Computes the logical OR of two values. This instruction performs an
    OR operation between the contents of s_reg1 and either the contents
    of s_reg2 or the immediate value and then places the result in the
    destination register.

Logical Sum - OR (or)
    Synonym for bis.

Logical Difference - XOR (xor)
    Computes the XOR of two values. This instruction performs an XOR
    operation between the contents of s_reg1 and either the contents of
    s_reg2 or the immediate value and then places the result in the
    destination register.

Logical Product with Complement - ANDNOT (bic)
    Computes the logical AND of two values. This instruction performs
    an AND operation between the contents of s_reg1 and the one's
    complement of either the contents of s_reg2 or the immediate value
    and then places the result in the destination register.

Logical Product with Complement - ANDNOT (andnot)
    Synonym for bic.

Logical Sum with Complement - ORNOT (ornot)
    Computes the logical OR of two values. This instruction performs an
    OR operation between the contents of s_reg1 and the one's
    complement of either the contents of s_reg2 or the immediate value
    and then places the result in the destination register.

Logical Equivalence - XORNOT (eqv)
    Computes the logical XOR of two values. This instruction performs
    an XOR operation between the contents of s_reg1 and the one's
    complement of either the contents of s_reg2 or the immediate value
    and then places the result in the destination register.

Logical Equivalence - XORNOT (xornot)
    Synonym for eqv.

Shift Left Logical (sll)
    Shifts the contents of a register left (toward the sign bit) and
    inserts zeros in the vacated bit positions. Register s_reg1
    contains the value to be shifted, and either the contents of s_reg2
    or the immediate value specifies the shift count. If s_reg2 or the
    immediate value is greater than 63 or less than zero, s_reg1 shifts
    by the result of the following AND operation: s_reg2 AND 63.

Shift Right Logical (srl)
    Shifts the contents of a register to the right (toward the least
    significant bit) and inserts zeros in the vacated bit positions.
    Register s_reg1 contains the value to be shifted, and either the
    contents of s_reg2 or the immediate value specifies the shift
    count. If s_reg2 or the immediate value is greater than 63 or less
    than zero, s_reg1 shifts by the result of the following AND
    operation: s_reg2 AND 63.

Shift Right Arithmetic (sra)
    Shifts the contents of a register to the right (toward the least
    significant bit) and inserts the sign bit in the vacated bit
    positions. Register s_reg1 contains the value to be shifted, and
    either the contents of s_reg2 or the immediate value specifies the
    shift count. If s_reg2 or the immediate value is greater than 63 or
    less than zero, s_reg1 shifts by the result of the following AND
    operation: s_reg2 AND 63.
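The complement forms and the shift-count rule in Table 3-7 are easy to misread, so a short model may help. The following Python sketch treats a register as an unsigned 64-bit integer. It is an illustration under that assumption, not Alpha code, and the function names are borrowed from the mnemonics only for readability.

```python
MASK64 = (1 << 64) - 1


def bic(a, b):
    # AND with the one's complement of the second operand.
    return a & ~b & MASK64


def ornot(a, b):
    # OR with the one's complement of the second operand.
    return (a | ~b) & MASK64


def eqv(a, b):
    # XOR with the one's complement of the second operand.
    return (a ^ ~b) & MASK64


def sll(value, count):
    # Only (count AND 63) is used as the shift count.
    return (value << (count & 63)) & MASK64


def srl(value, count):
    # Zeros fill the vacated high-order bit positions.
    return (value & MASK64) >> (count & 63)


def sra(value, count):
    value &= MASK64
    if value >> 63:
        # Reinterpret as a negative integer so that >> replicates the sign bit.
        value -= 1 << 64
    return (value >> (count & 63)) & MASK64
```

For example, a shift count of 65 behaves like a count of 1, because 65 AND 63 is 1.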


3.4 Relational Instructions 


Relational instructions compare values in registers. 


Table 3-8 lists the mnemonics and operands for instructions that perform
relational operations. Each of the instructions listed in the table can take an
operand in any of the forms shown.

Table 3-8: Relational Instruction Formats

Instruction                                     Mnemonic  Operands

Compare Signed Quadword Equal                   cmpeq     s_reg1, s_reg2, d_reg or
Compare Signed Quadword Less Than               cmplt     d_reg/s_reg1, s_reg2 or
Compare Signed Quadword Less Than or Equal      cmple     s_reg1, val_immed, d_reg or
Compare Unsigned Quadword Less Than             cmpult    d_reg/s_reg1, val_immed
Compare Unsigned Quadword Less Than or Equal    cmpule


Table 3-9 describes the operations performed by relational instructions. 


Table 3-9: Relational Instruction Descriptions

Instruction
    Description

Compare Signed Quadword Equal (cmpeq)
    Compares two 64-bit values. If the value in s_reg1 equals the value
    in s_reg2 or the immediate value, this instruction sets the
    destination register to one; otherwise, it sets the destination
    register to zero.

Compare Signed Quadword Less Than (cmplt)
    Compares two signed 64-bit values. If the value in s_reg1 is less
    than the value in s_reg2 or the immediate value, this instruction
    sets the destination register to one; otherwise, it sets the
    destination register to zero.

Compare Signed Quadword Less Than or Equal (cmple)
    Compares two signed 64-bit values. If the value in s_reg1 is less
    than or equal to the value in s_reg2 or the immediate value, this
    instruction sets the destination register to one; otherwise, it
    sets the destination register to zero.

Compare Unsigned Quadword Less Than (cmpult)
    Compares two unsigned 64-bit values. If the value in s_reg1 is less
    than either the value in s_reg2 or the immediate value, this
    instruction sets the destination register to one; otherwise, it
    sets the destination register to zero.

Compare Unsigned Quadword Less Than or Equal (cmpule)
    Compares two unsigned 64-bit values. If the value in s_reg1 is less
    than or equal to either the value in s_reg2 or the immediate value,
    this instruction sets the destination register to one; otherwise,
    it sets the destination register to zero.
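The signed and unsigned comparisons in Table 3-9 give different answers for the same bit pattern whenever the high-order bit is set. The Python sketch below models that difference. It is not Alpha code; registers are assumed to be 64-bit patterns held in Python integers, and the function names follow the mnemonics for convenience.

```python
MASK64 = (1 << 64) - 1


def as_signed(value):
    """Interpret a 64-bit pattern as a signed two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def cmpeq(a, b):
    return int((a & MASK64) == (b & MASK64))


def cmplt(a, b):
    return int(as_signed(a) < as_signed(b))


def cmple(a, b):
    return int(as_signed(a) <= as_signed(b))


def cmpult(a, b):
    return int((a & MASK64) < (b & MASK64))


def cmpule(a, b):
    return int((a & MASK64) <= (b & MASK64))
```

With all 64 bits set, cmplt sees the value -1 and reports it as less than 1, whereas cmpult sees the largest unsigned quadword and reports the opposite.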


3.5 Move Instructions 


Move instructions move data between registers. 


Table 3-10 lists the mnemonics and operands for instructions that perform
move operations. The table is divided into groups of instructions. The
operands specified within a particular group apply to all of the instructions
contained in that group.

Table 3-10: Move Instruction Formats

Instruction                             Mnemonic  Operands

Move                                    mov       s_reg, d_reg or
                                                  val_immed, d_reg

Move if Equal to Zero                   cmoveq    s_reg1, s_reg2, d_reg or
Move if Not Equal to Zero               cmovne    d_reg/s_reg1, s_reg2 or
Move if Less Than Zero                  cmovlt    s_reg1, val_immed, d_reg or
Move if Less Than or Equal to Zero      cmovle    d_reg/s_reg1, val_immed
Move if Greater Than Zero               cmovgt
Move if Greater Than or Equal to Zero   cmovge
Move if Low Bit Clear                   cmovlbc
Move if Low Bit Set                     cmovlbs


Table 3-11 describes the operations performed by move instructions. 


Table 3-11: Move Instruction Descriptions

Instruction
    Description

Move (mov)
    Moves the contents of the source register or the immediate value to
    the destination register.

Move if Equal to Zero (cmoveq)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the contents of s_reg1 is equal to zero.

Move if Not Equal to Zero (cmovne)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the contents of s_reg1 is not equal to
    zero.

Move if Less Than Zero (cmovlt)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the contents of s_reg1 is less than zero.

Move if Less Than or Equal to Zero (cmovle)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the contents of s_reg1 is less than or
    equal to zero.

Move if Greater Than Zero (cmovgt)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the contents of s_reg1 is greater than
    zero.

Move if Greater Than or Equal to Zero (cmovge)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the contents of s_reg1 is greater than or
    equal to zero.

Move if Low Bit Clear (cmovlbc)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the low-order bit of s_reg1 is equal to
    zero.

Move if Low Bit Set (cmovlbs)
    Moves the contents of s_reg2 or the immediate value to the
    destination register if the low-order bit of s_reg1 is not equal to
    zero.
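Each conditional move in Table 3-11 tests s_reg1 and either replaces the destination or leaves it unchanged. The Python sketch below models that behavior. It is not Alpha code: the cmov helper and the CONDITIONS table are hypothetical names, and the sketch assumes that s_reg1 is tested as a signed 64-bit value.

```python
MASK64 = (1 << 64) - 1


def as_signed(value):
    """Interpret a 64-bit pattern as a signed two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


# One predicate per mnemonic, applied to the signed value of s_reg1.
CONDITIONS = {
    "cmoveq": lambda v: v == 0,
    "cmovne": lambda v: v != 0,
    "cmovlt": lambda v: v < 0,
    "cmovle": lambda v: v <= 0,
    "cmovgt": lambda v: v > 0,
    "cmovge": lambda v: v >= 0,
    "cmovlbc": lambda v: (v & 1) == 0,
    "cmovlbs": lambda v: (v & 1) != 0,
}


def cmov(mnemonic, s_reg1, src2, d_reg):
    """Return the new destination value.

    src2 is the contents of s_reg2 or the immediate value; d_reg is the
    destination's current value, which is kept when the test fails.
    """
    return src2 if CONDITIONS[mnemonic](as_signed(s_reg1)) else d_reg


def maximum(a, b):
    # Branch-free selection: start with a, replace it with b when a < b.
    a_is_less = int(a < b)
    return cmov("cmovne", a_is_less, b, a)
```

The maximum helper shows the usual pattern: a comparison produces 0 or 1, and a conditional move selects a value without a branch.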


3.6 Control Instructions 


Control instructions change the control flow of an assembly program. They
affect the sequence in which instructions are executed by transferring
control from one location in a program to another.

Table 3-12 lists the mnemonics and operands for instructions that perform
control operations. The table is divided into groups of instructions. The
operands specified within a particular group apply to all of the instructions
contained in that group.


Table 3-12: Control Instruction Formats

Instruction                               Mnemonic           Operands

Branch if Equal to Zero                   beq                s_reg, label
Branch if Not Equal to Zero               bne
Branch if Less Than Zero                  blt
Branch if Less Than or Equal to Zero      ble
Branch if Greater Than Zero               bgt
Branch if Greater Than or Equal to Zero   bge
Branch if Low Bit is Clear                blbc
Branch if Low Bit is Set                  blbs

Branch                                    br                 d_reg, label or
Branch to Subroutine                      bsr                label

Jump                                      jmp                d_reg, (s_reg), jhint or
Jump to Subroutine                        jsr [a]            d_reg, (s_reg) or
                                                             (s_reg), jhint or
                                                             (s_reg) or
                                                             d_reg, address or
                                                             address

Return from Subroutine                    ret                d_reg, (s_reg), rhint or
Jump to Subroutine Return                 jsr_coroutine [a]  d_reg, (s_reg) or
                                                             d_reg, rhint or
                                                             d_reg or
                                                             (s_reg), rhint or
                                                             (s_reg) or
                                                             rhint or
                                                             no_operands

[a] In addition to the normal operands that can be specified with this
    instruction, relocation operands can also be specified (see
    Section 2.6.4).


Table 3-13 describes the operations performed by control instructions. For 
all branch instructions described in the table, the branch destinations must 
be defined in the source being assembled, not in an external source file. 


Table 3-13: Control Instruction Descriptions 


Instruction 


Description 


Branch if Equal to Zero (beq) Branches to the specified label if the contents of the 


source register is equal to zero. 


Branch if Not Equal to Zero 


Branches to the specified label if the contents of the source 
(bne) register is not equal to zero. 


Branch if Less Than Zero (b1t) Branches to the specified label if the contents of the source 
register is less than zero. The comparison treats the source 


register as a signed 64-bit value. 
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Table 3-13: Control Instruction Descriptions (cont.) 


Instruction 


Description 


Branch if Less Than or Equal 


to Zero (ble) 


Branch if Greater Than Zero 


(bgt) 


Branch if Greater Than or 
Equal to Zero (bge) 


Branch if Low Bit is Clear 


(blbc) 


Branch if Low Bit is Set (b1bs) 


Branch (br) 


Branch to Subroutine (bsr) 


J ump (jmp) 


J ump to Subroutine (5 sr) 


Return from Subroutine (ret) 


J ump to Subroutine Return 


(jsr_coroutine) 


Branches to the specified label if the contents of the source 
register is less than or equal to zero. The comparison treats 
the source register as a signed 64-bit value. 


Branches to the specified label if the contents of the source 
register is greater than zero. The comparison treats the 
source register as a signed 64-bit value. 


Branches to the specified label if the contents of the source 
register is greater than or equal to zero. The comparison treats 
the source register as a signed 64-bit value. 


Branches to the specified label if the low-order bit of the 
source register is equal to zero. 


Branches to the specified label if the low-order bit of the 
source register is not equal to zero. 


Branches unconditionally to the specified label. If a destination 
register is specified, the address of the instruction following 
the br instruction is stored in that register. 


Branches unconditionally to the specified label and stores the 
return address in the destination register. If a destination register 
is not specified, register $26 (ra) is used. 


Unconditionally jumps to a specified location. A symbolic address 
or the source register specifies the target location. If a destination 
register is specified, the address of the instruction following the 
jmp instruction is stored in the specified register. 


Unconditionally jumps to a specified location and stores the return 
address in the destination register. If a destination register is not 
specified, register $26 (ra) is used. A symbolic address or the source 
register specifies the target location. The instruction jsr procname 
transfers to procname and saves the return address in register $26. 


Unconditionally returns from a subroutine. If a destination register is 
specified, the address of the instruction following the ret instruction 
is stored in the specified register. The source register contains the 
return address. If the source register is not specified, register $26 (ra) 
is used. If a hint is not specified, a hint value of one is used. 


Unconditionally returns from a subroutine and stores the 
return address in the destination register. If a destination 
register is not specified, register $26 (ra) is used. The source 
register contains the target address. If the source register is 
not specified, register $26 (ra) is used. 

All jump instructions (jmp, jsr, ret, and jsr_coroutine) perform 
identical operations. They differ only in hints to possible branch-prediction 
logic. See the Alpha Architecture Reference Manual for information about 
branch-prediction logic. 
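
The conditional branch descriptions in this table depend on whether the 
source register is read as a signed 64-bit value or only its low-order bit 
is tested. The following Python sketch models those tests for the ble, bgt, 
bge, blbc, and blbs instructions. The names as_signed64, BRANCH_TESTS, and 
branch_taken are illustrative assumptions, and a Python integer stands in 
for the register; this is not how the assembler itself is implemented. 

```python
MASK64 = (1 << 64) - 1


def as_signed64(value):
    """Interpret the low 64 bits of value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value


# Hypothetical lookup table: mnemonic -> test applied to the source register.
BRANCH_TESTS = {
    "ble": lambda reg: as_signed64(reg) <= 0,   # signed, less than or equal to zero
    "bgt": lambda reg: as_signed64(reg) > 0,    # signed, greater than zero
    "bge": lambda reg: as_signed64(reg) >= 0,   # signed, greater than or equal to zero
    "blbc": lambda reg: (reg & 1) == 0,         # low-order bit clear
    "blbs": lambda reg: (reg & 1) == 1,         # low-order bit set
}


def branch_taken(mnemonic, source_register):
    """Return True if the modeled branch would transfer to its label."""
    return BRANCH_TESTS[mnemonic](source_register)
```

For example, a register holding all ones is -1 when read as a signed 
quadword, so the ble test succeeds and the bgt test fails. 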


3.7 Byte-Manipulation Instructions 


Byte-manipulation instructions perform byte operations on values in 
registers. 


Table 3-14 lists the mnemonics and operands for instructions that perform 
byte-manipulation operations. Each of the instructions listed in the table 
can take an operand in any of the forms shown. 


Table 3-14: Byte-Manipulation Instruction Formats 


Instruction               Mnemonic   Operands 

Compare Byte              cmpbge     s_reg1, s_reg2, d_reg or 
Extract Byte Low          extbl      d_reg/s_reg1, s_reg2 or 
Extract Word Low          extwl      s_reg1, val_immed, d_reg or 
Extract Longword Low      extll      d_reg/s_reg1, val_immed 
Extract Quadword Low      extql 
Extract Word High         extwh 
Extract Longword High     extlh 
Extract Quadword High     extqh 
Insert Byte Low           insbl 
Insert Word Low           inswl 
Insert Longword Low       insll 
Insert Quadword Low       insql 
Insert Word High          inswh 
Insert Longword High      inslh 
Insert Quadword High      insqh 
Mask Byte Low             mskbl 
Mask Word Low             mskwl 
Mask Longword Low         mskll 
Mask Quadword Low         mskql 
Mask Word High            mskwh 
Mask Longword High        msklh 
Mask Quadword High        mskqh 
Zero Bytes                zap 
Zero Bytes NOT            zapnot 

Table 3-15 describes the operations performed by byte-manipulation 
instructions. 


Table 3-15: Byte-Manipulation Instruction Descriptions 


Instruction 
    Description 


Compare Byte (cmpbge) 
    Performs eight parallel unsigned byte comparisons between 
    corresponding bytes of register s_reg1 and s_reg2 or the immediate 
    value. A bit is set in the destination register if a byte in s_reg1 
    is greater than or equal to the corresponding byte in s_reg2 or the 
    immediate value. 

    The results of the comparisons are stored in the eight low-order bits of 
    the destination register; bit 0 of the destination register corresponds 
    to byte 0 and so forth. The 56 high-order bits of the destination 
    register are cleared. 


Extract Byte Low (extbl) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into 
    the vacated bit positions, and then extracts the low-order byte 
    into the destination register. The seven high-order bytes of the 
    destination register are cleared to zeros. Bits 0-2 of register s_reg2 
    or the immediate value specify the shift count. 


Extract Word Low (extwl) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the 
    vacated bit positions, and then extracts the two low-order bytes and 
    stores them in the destination register. The six high-order bytes of 
    the destination register are cleared to zeros. Bits 0-2 of register 
    s_reg2 or the immediate value specify the shift count. 


Extract Longword Low (extll) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into the 
    vacated bit positions, and then extracts the four low-order bytes and 
    stores them in the destination register. The four high-order bytes 
    of the destination register are cleared to zeros. Bits 0-2 of register 
    s_reg2 or the immediate value specify the shift count. 


Extract Quadword Low (extql) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts zeros into 
    the vacated bit positions, and then extracts all eight bytes and 
    stores them in the destination register. Bits 0-2 of register s_reg2 
    or the immediate value specify the shift count. 


Extract Word High (extwh) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the 
    vacated bit positions, and then extracts the two low-order bytes and 
    stores them in the destination register. The six high-order bytes of 
    the destination register are cleared to zeros. Bits 0-2 of register 
    s_reg2 or the immediate value specify the shift count. 


Extract Longword High (extlh) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into the 
    vacated bit positions, and then extracts the four low-order bytes and 
    stores them in the destination register. The four high-order bytes 
    of the destination register are cleared to zeros. Bits 0-2 of register 
    s_reg2 or the immediate value specify the shift count. 


Extract Quadword High (extqh) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts zeros into 
    the vacated bit positions, and then extracts all eight bytes and 
    stores them in the destination register. Bits 0-2 of register s_reg2 
    or the immediate value specify the shift count. 


Insert Byte Low (insbl) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts the byte into a 
    field of zeros, and then places the result in the destination register. 
    Bits 0-2 of register s_reg2 or the immediate value specify the shift 
    count. 


Insert Word Low (inswl) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts the word into a 
    field of zeros, and then places the result in the destination register. 
    Bits 0-2 of register s_reg2 or the immediate value specify the shift 
    count. 


Insert Longword Low (insll) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts the longword into a 
    field of zeros, and then places the result in the destination register. 
    Bits 0-2 of register s_reg2 or the immediate value specify the shift 
    count. 


Insert Quadword Low (insql) 
    Shifts the register s_reg1 left by 0-7 bytes, inserts the quadword into a 
    field of zeros, and then places the result in the destination register. 
    Bits 0-2 of register s_reg2 or the immediate value specify the shift 
    count. 


Insert Word High (inswh) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts the word into a 
    field of zeros, and then places the result in the destination register. 
    Bits 0-2 of register s_reg2 or the immediate value specify the shift 
    count. 


Insert Longword High (inslh) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts the 
    longword into a field of zeros, and then places the result in 
    the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the shift count. 


Insert Quadword High (insqh) 
    Shifts the register s_reg1 right by 0-7 bytes, inserts the 
    quadword into a field of zeros, and then places the result in 
    the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the shift count. 


Mask Byte Low (mskbl) 
    Sets a byte in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the byte. 


Mask Word Low (mskwl) 
    Sets a word in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the word. 


Mask Longword Low (mskll) 
    Sets a longword in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the longword. 


Mask Quadword Low (mskql) 
    Sets a quadword in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the quadword. 


Mask Word High (mskwh) 
    Sets a word in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the word. 


Mask Longword High (msklh) 
    Sets a longword in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the longword. 


Mask Quadword High (mskqh) 
    Sets a quadword in register s_reg1 to zero and stores the result 
    in the destination register. Bits 0-2 of register s_reg2 or the 
    immediate value specify the offset of the quadword. 


Zero Bytes (zap) 
    Sets selected bytes of register s_reg1 to zero and places the 
    result in the destination register. Bits 0-7 of register s_reg2 or 
    an immediate value specify the bytes to be cleared to zeros. Each 
    bit corresponds to one byte in register s_reg1; for example, bit 
    0 corresponds to byte 0. A bit with a value of one indicates its 
    corresponding byte should be cleared to zeros. 


Zero Bytes NOT (zapnot) 
    Sets selected bytes of register s_reg1 to zero and places the 
    result in the destination register. Bits 0-7 of register s_reg2 or 
    an immediate value specify the bytes to be cleared to zeros. Each 
    bit corresponds to one byte in register s_reg1; for example, bit 
    0 corresponds to byte 0. A bit with a value of zero indicates its 
    corresponding byte should be cleared to zeros. 
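

The byte-manipulation descriptions above can be checked against a small 
software model. The following Python sketch follows the wording of Table 3-15 
for cmpbge, the low extract, insert, and mask forms, and zap and zapnot. The 
function names, the size argument (1, 2, 4, or 8 bytes), and the use of Python 
integers as registers are assumptions made for illustration; the high forms 
are not modeled. 

```python
MASK64 = (1 << 64) - 1


def field_mask(size):
    """Mask covering the low `size` bytes (1, 2, 4, or 8)."""
    return (1 << (8 * size)) - 1


def cmpbge(s_reg1, s_reg2):
    """Eight unsigned byte compares; bit i is set if byte i of s_reg1 >= byte i of s_reg2."""
    result = 0
    for i in range(8):
        a = (s_reg1 >> (8 * i)) & 0xFF
        b = (s_reg2 >> (8 * i)) & 0xFF
        if a >= b:
            result |= 1 << i
    return result


def ext_low(s_reg1, s_reg2, size):
    """extbl/extwl/extll/extql: shift right by 0-7 bytes, keep the low `size` bytes."""
    shift = 8 * (s_reg2 & 7)          # bits 0-2 give the byte count
    return ((s_reg1 & MASK64) >> shift) & field_mask(size)


def ins_low(s_reg1, s_reg2, size):
    """insbl/inswl/insll/insql: place the low `size` bytes into a field of zeros."""
    shift = 8 * (s_reg2 & 7)
    return ((s_reg1 & field_mask(size)) << shift) & MASK64


def msk_low(s_reg1, s_reg2, size):
    """mskbl/mskwl/mskll/mskql: clear `size` bytes starting at the byte offset."""
    shift = 8 * (s_reg2 & 7)
    return s_reg1 & MASK64 & ~(field_mask(size) << shift)


def zap(s_reg1, byte_select):
    """Clear every byte whose select bit (bits 0-7) is one."""
    result = s_reg1 & MASK64
    for i in range(8):
        if (byte_select >> i) & 1:
            result &= ~(0xFF << (8 * i))
    return result


def zapnot(s_reg1, byte_select):
    """Clear every byte whose select bit (bits 0-7) is zero."""
    return zap(s_reg1, ~byte_select & 0xFF)
```

A common use of cmpbge is to compare a zero register against a quadword of 
character data: the bits set in the result then mark the bytes that are zero. 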


3.8 Special-Purpose Instructions 


Special-purpose instructions perform miscellaneous tasks. 


Table 3-16 lists the mnemonics and operands for instructions that perform 
special operations. The table is divided into groups of instructions. The 
operands specified within a particular group apply to all of the instructions 
contained in that group. 


Table 3-16: Special-Purpose Instruction Formats 


Instruction                            Mnemonic    Operands 

Call Privileged Architecture Library   call_pal    palcode 

Architecture Mask                      amask       s_reg, d_reg or 
                                                   val_immed, d_reg 

Prefetch Data                          fetch       offset(b_reg) 
Prefetch Data, Modify Intent           fetch_m 

Read Process Cycle Counter             rpcc        d_reg or d_reg, reg 

Implementation Version                 implver     d_reg 

No Operation                           nop         no_operands 
Universal No Operation                 unop 
Trap Barrier                           trapb 
Exception Barrier                      excb 
Memory Barrier                         mb 
Write Memory Barrier                   wmb 

Count Leading Zero                     ctlz        s_reg, d_reg 
Count Population                       ctpop 
Count Trailing Zero                    cttz 


Table 3-17 describes the operations performed by special-purpose 
instructions. 


Table 3-17: Special-Purpose Instruction Descriptions 


Instruction 
    Description 


Call Privileged Architecture Library (call_pal) 
    Unconditionally transfers control to the exception handler. The 
    palcode operand is interpreted by software conventions. 


Architecture Mask (amask) 
    The value of the contents of s_reg or the immediate value represents 
    a mask of architectural extensions that are being requested. Bits 
    are cleared if they correspond to architectural extensions that are 
    present, and the result is placed in the destination register. 


Prefetch Data (fetch) 
    Indicates that the 512-byte block of data specified by 
    the effective address should be moved to a faster-access 
    part of the memory hierarchy. 


Prefetch Data, Modify Intent (fetch_m) 
    Indicates that the 512-byte block of data specified by the 
    effective address should be moved to a faster-access part of 
    the memory hierarchy. In addition, this instruction is a hint 
    that part or all of the data may be modified. 


Read Process Cycle Counter (rpcc) 
    Returns the contents of the process cycle counter in the destination 
    register. If reg is specified, the rpcc instruction is not issued until all 
    previous instructions that generate a result in reg are completed. If 
    R31 is specified as the reg operand, the reg operand is ignored and 
    the rpcc instruction does not wait for any preceding computation. 


Implementation Version (implver) 
    A small integer is placed in the destination register. This integer 
    specifies the major implementation version of the processor on which 
    it is executed. This information can be used to make code-scheduling 
    or tuning decisions. The returned small integer can have the values 0, 
    1, or 2. 0 indicates EV4, EV45, LCA, and LCA-45 Alpha chips (that is, 
    21064, 21064A, 21066, 21068, and 21066A, respectively); 1 indicates 
    an EV5 Alpha chip (21164); and 2 indicates an EV6 Alpha chip (21264). 


No Operation (nop) 
    Has no effect on the machine state. 


Universal No Operation (unop) 
    Has no effect on the machine state. 


Trap Barrier (trapb) 
    Guarantees that all previous arithmetic instructions are completed, 
    without incurring any arithmetic traps, before any instructions 
    after the trapb instruction are issued. 


Exception Barrier (excb) 
    Guarantees that all previous instructions complete any 
    exception-related behavior or rounding-mode behavior before any 
    instructions after the excb instruction are issued. 


Memory Barrier (mb) 
    Used to serialize access to memory. See the Alpha Architecture 
    Reference Manual for additional information on memory barriers. 


Write Memory Barrier (wmb) 
    Guarantees that all previous store instructions access memory before 
    any store instructions issued after the wmb instruction. 


Count Leading Zeros (ctlz) 
    Counts the number of leading zeros in s_reg, starting at the most 
    significant bit position, and writes that count to d_reg. 


Count Population (ctpop) 
    Counts the number of ones in s_reg and writes the count to d_reg. 


Count Trailing Zeros (cttz) 
    Counts the number of trailing zeros in s_reg, starting at the least 
    significant bit position, and writes the count to d_reg. 
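

The three count instructions can be modeled in a few lines. The following 
Python sketch treats s_reg as a 64-bit unsigned value; the function names 
mirror the mnemonics but are otherwise illustrative. Returning 64 for a zero 
source follows from counting every bit position as a zero and is an 
assumption of this sketch rather than a statement taken from the table. 

```python
MASK64 = (1 << 64) - 1


def ctlz(s_reg):
    """Count zero bits above the most significant one bit of a 64-bit value."""
    return 64 - (s_reg & MASK64).bit_length()


def ctpop(s_reg):
    """Count the one bits in a 64-bit value."""
    return bin(s_reg & MASK64).count("1")


def cttz(s_reg):
    """Count zero bits below the least significant one bit of a 64-bit value."""
    value = s_reg & MASK64
    if value == 0:
        return 64
    # value & -value isolates the lowest set bit.
    return (value & -value).bit_length() - 1
```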


Floating-Point Instruction Set 


This chapter describes the assembler’s floating-point instructions. See 
Chapter 3 for a description of the integer instructions. For details on the 
instruction set beyond the scope of this manual, see the Alpha Architecture 
Reference Manual. 


This chapter addresses the following topics: 


•  Background information on floating-point operations: data types, the 
   control register, exceptions, rounding modes, and qualifiers (Section 4.1) 


•  The instructions in the assembler’s floating-point instruction set, which 
   consists of the following classes: 


   -  Load and store instructions (Section 4.2) 
   -  Arithmetic instructions (Section 4.3) 
   -  Relational instructions (Section 4.4) 
   -  Move instructions (Section 4.5) 
   -  Control instructions (Section 4.6) 
   -  Special-purpose instructions (Section 4.7) 


A particular floating-point instruction may be implemented in hardware, 
software, or a combination of hardware and software. 


Tables in this chapter show the format for each instruction in the 
floating-point instruction set. The tables list the instruction names and the 
forms of operands that can be used with each instruction. The specifiers 
used in the tables to identify operands have the following meanings: 


Operand Specifier        Description 


address                  A symbolic expression whose effective value 
                         is used as an address. 

d_reg                    Destination register. A floating-point register 
                         that receives a value as a result of an operation. 

d_reg/s_reg              One floating-point register that is used as both a 
                         destination register and a source register. 

label                    A label that identifies a location in a program. 

s_reg, s_reg1, s_reg2    Source registers. Floating-point registers whose 
                         contents are to be used in an operation. 

val_expr                 An expression whose value is a floating-point 
                         constant. 


The following terms are used to discuss floating-point operations: 


Term         Meaning 

Infinite     A value of +INF or -INF. 

Infinity     A symbolic entity that represents values with magnitudes greater 
             than the largest magnitude for a particular format. 

Ordered      The usual result from a comparison, namely: less than 
             (<), equal to (=), or greater than (>). 

NaN          Symbolic entities that represent values not otherwise available in 
             floating-point formats. (NaN is an acronym for not-a-number.) 

Unordered    The condition that results from a floating-point comparison 
             when one or both operands are NaNs. 


There are two kinds of NaNs: 


•  Quiet NaNs represent unknown or uninitialized values. 


•  Signaling NaNs represent symbolic values and values that are too big 
   or too precise for the format. Signaling NaNs raise an invalid-operation 
   exception whenever an operation is attempted on them. 


4.1 Background Information on Floating-Point Operations 


Topics addressed in the following sections include: 


•  Floating-point data types (Section 4.1.1) 
•  The floating-point control register (Section 4.1.2) 
•  Floating-point exceptions (Section 4.1.3) 
•  Floating-point rounding modes (Section 4.1.4) 
•  Floating-point instruction qualifiers (Section 4.1.5) 


4.1.1 Floating-Point Data Types 


Floating-point instructions operate on the following data types: 


•  D_floating (VAX double precision, limited support) 
•  F_floating (VAX single precision) 
•  G_floating (VAX double precision) 
•  S_floating (IEEE single precision) 
•  T_floating (IEEE double precision) 
•  Longword integer and quadword integer 


Figure 4-1 shows the memory formats for the single- and double-precision 
floating-point data types. 


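Figure 4-1: Floating-Point Data Formats 


S_floating    bit 31: Sign; bits 30-23: Exponent; bits 22-0: Fraction 

T_floating    bit 63: Sign; bits 62-52: Exponent; bits 51-0: Fraction 

F_floating    bits 31-16: Fraction (low); bit 15: Sign; 
              bits 14-7: Exponent; bits 6-0: Fraction (high) 

D_floating    bits 63-48: Fraction (low); bits 47-32: Fraction (mid-low); 
              bits 31-16: Fraction (mid-high); bit 15: Sign; 
              bits 14-7: Exponent; bits 6-0: Fraction (high) 

G_floating    bits 63-48: Fraction (low); bits 47-32: Fraction (mid-low); 
              bits 31-16: Fraction (mid-high); bit 15: Sign; 
              bits 14-4: Exponent; bits 3-0: Fraction (high) 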
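

The S_floating and T_floating layouts in Figure 4-1 are the IEEE single- and 
double-precision formats, so their fields can be inspected from any language 
that exposes IEEE values. The following Python sketch uses the standard 
struct module to split a value into sign, exponent, and fraction fields. The 
function names are illustrative assumptions, and the VAX formats (F_floating, 
D_floating, and G_floating) are not covered. 

```python
import struct


def s_floating_fields(value):
    """Return (sign, exponent, fraction) of value in S_floating (IEEE single) format."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31                  # bit 31
    exponent = (bits >> 23) & 0xFF     # bits 30-23
    fraction = bits & 0x7FFFFF         # bits 22-0
    return sign, exponent, fraction


def t_floating_fields(value):
    """Return (sign, exponent, fraction) of value in T_floating (IEEE double) format."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    sign = bits >> 63                      # bit 63
    exponent = (bits >> 52) & 0x7FF        # bits 62-52
    fraction = bits & ((1 << 52) - 1)      # bits 51-0
    return sign, exponent, fraction
```

For example, s_floating_fields(1.0) yields a sign of 0, a biased exponent of 
127, and a fraction of 0. 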


4.1.2 Floating-Point Control Register 


The floating-point control register (FPCR) contains status and control 
information. It controls the arithmetic rounding mode of instructions that 
specify dynamic rounding (the d qualifier; see Section 4.1.5 for information 
on instruction qualifiers) and gives a summary for each exception type of 
the exception conditions detected by the floating-point instructions. It also 
contains an overall summary bit indicating whether an exception occurred. 


Figure 4-2 shows the format of the floating-point control register. 


Figure 4-2: Floating-Point Control Register 


Bits:     63    62-60     59-58   57    56    55    54    53    52    51-0 
Field:    sum   raz/ign   dyn     iov   ine   unf   ovf   dze   inv   raz/ign 


The fields of the floating-point control register have the following meaning: 


Bits     Name      Description 

63       sum       Summary: records the bitwise OR of the FPCR 
                   exception bits (bits 57 to 52). 

62-60    raz/ign   Read-As-Zero: ignored when written. 

59-58    dyn       Dynamic Rounding Mode: indicates the current 
                   rounding mode to be used by an IEEE floating-point 
                   instruction that specifies the dynamic mode (d qualifier). 
                   The bit assignments for this field are as follows: 

                   00 - Chopped rounding mode 
                   01 - Minus infinity 
                   10 - Normal rounding 
                   11 - Plus infinity 

57       iov       Integer overflow. 

56       ine       Inexact result. 

55       unf       Underflow. 

54       ovf       Overflow. 

53       dze       Division by zero. 

52       inv       Invalid operation. 

51-0     raz/ign   Read-As-Zero: ignored when written. 


The floating-point exceptions associated with bits 57 to 52 are described 
in Section 4.1.3. 
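

The field layout above translates directly into shifts and masks. The 
following Python sketch decodes a 64-bit FPCR image into its summary bit, 
dynamic rounding mode, and exception bits. The names DYN_MODES, 
EXCEPTION_BITS, and decode_fpcr are hypothetical, and the sketch does not 
show how the register is read or written on a real system. 

```python
# Encodings of the dyn field (bits 59-58), as listed in the field table.
DYN_MODES = {
    0b00: "chopped",
    0b01: "minus infinity",
    0b10: "normal",
    0b11: "plus infinity",
}

# Bit positions of the exception bits (bits 57 to 52).
EXCEPTION_BITS = {
    "iov": 57,
    "ine": 56,
    "unf": 55,
    "ovf": 54,
    "dze": 53,
    "inv": 52,
}


def decode_fpcr(fpcr):
    """Split a 64-bit FPCR image into summary, rounding mode, and exception names."""
    raised = [name for name, bit in EXCEPTION_BITS.items() if (fpcr >> bit) & 1]
    return {
        "sum": bool((fpcr >> 63) & 1),
        "dyn": DYN_MODES[(fpcr >> 58) & 0b11],
        "exceptions": raised,
    }
```

For example, an image with bits 63, 59, 58, and 53 set decodes to a set 
summary bit, plus-infinity rounding, and a division-by-zero exception. 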


4.1.3 Floating-Point Exceptions 


Six exception conditions can result from the use of floating-point instructions. 
All of the exceptions are signaled by an arithmetic exception trap. The 
exceptions are as follows: 


•  Invalid Operation: An invalid-operation exception is signaled if 
   any operand of a floating-point instruction, other than cmptxx, is 
   nonfinite. (The cmptxx instruction operates normally with plus and 
   minus infinity.) This trap is always enabled. If this trap occurs, an 
   unpredictable value is stored in the destination register. 


•  Division by Zero: A division-by-zero exception is taken if the numerator 
   does not cause an invalid-operation trap and the denominator is zero. 
   This trap is always enabled. If this trap occurs, an unpredictable value is 
   stored in the destination register. 


•  Overflow: An overflow exception is signaled if the rounded result 
   exceeds the largest finite number of the destination format. This trap 
   is always enabled. If this trap occurs, an unpredictable value is stored 
   in the destination register. 


•  Underflow: An underflow exception occurs if the rounded result is 
   smaller than the smallest finite number of the destination format. This 
   trap can be disabled. If this trap occurs, a true zero is always stored 
   in the destination register. 


•  Inexact Result: An inexact-result exception occurs if the infinitely 
   precise result differs from the rounded result. This trap can be disabled. 
   If this trap occurs, the normal rounded result is still stored in the 
   destination register. 


•  Integer Overflow: An integer-overflow exception occurs if the 
   conversion from a floating-point or integer format to an integer format 
   results in a value that is outside of the range of values that the 
   destination format can represent. This trap can be disabled. If this 
   trap occurs, the true result is truncated to the number of bits in the 
   destination format and stored in the destination register. 


4.1.4 Floating-Point Rounding Modes 


If a true result can be exactly represented in a floating-point format, all 
rounding modes map the true result to that value. 


The following abbreviations are used in the descriptions of rounding modes 
provided in this section: 


•  LSB (least significant bit): For a positive representable number, A, 
   whose fraction is not all ones, A + 1 LSB is the next-larger representable 
   number, and A + 1/2 LSB is exactly halfway between A and the 
   next-larger representable number. 


•  MAX: The largest noninfinite representable floating-point number. 


•  MIN: The smallest nonzero representable normalized floating-point 
   number. 


For VAX floating-point operations, two rounding modes are provided and are 
specified in each instruction: 


•  Normal rounding (biased): 

   -  Maps the true result to the nearest of two representable results, with 
      true results exactly halfway between mapped to the larger in absolute 
      value. (Sometimes referred to as biased rounding away from zero.) 

   -  Maps true results > MAX + 1/2 LSB in magnitude to an overflow 

   -  Maps true results < MIN - 1/2 LSB in magnitude to an underflow 


•  Chopped rounding: 

   -  Maps the true result to the smaller in magnitude of two surrounding 
      representable results 

   -  Maps true results > MAX + 1 LSB in magnitude to an overflow 

   -  Maps true results < MIN in magnitude to an underflow 


For IEEE floating-point operations, four rounding modes are provided: 


•  Normal rounding (unbiased round to nearest): 

   -  Maps the true result to the nearest of two representable results, 
      with true results exactly halfway between being mapped to the 
      one whose fraction ends in 0 (sometimes referred to as unbiased 
      rounding to even) 

   -  Maps true results > MAX + 1/2 LSB in magnitude to an overflow 

   -  Maps true results < MIN - 1/2 LSB in magnitude to an underflow 


•  Rounding toward minus infinity: 

   -  Maps the true results to the smaller of two surrounding representable 
      results 

   -  Maps true results > MAX in magnitude to an overflow 

   -  Maps positive true results < +MIN to an underflow 

   -  Maps negative true results > -MIN + 1 LSB to an underflow 


•  Chopped rounding (round toward zero): 

   -  Maps the true result to the smaller in magnitude of two surrounding 
      representable results 

   -  Maps true results > MAX + 1 LSB in magnitude to an overflow 

   -  Maps nonzero true results < MIN in magnitude to an underflow 


•  Rounding toward plus infinity: 

   -  Maps the true results to the larger of two surrounding representable 
      results 

   -  Maps true results > MAX in magnitude to an overflow 

   -  Maps positive results <= +MIN - 1 LSB to an underflow 

   -  Maps negative true results > -MIN to an underflow 


The first three of the IEEE rounding modes can be specified in the 
instruction. The last mode, rounding toward plus infinity, can be obtained 
by setting the floating-point control register (FPCR) to select it and then 
specifying dynamic rounding mode in the instruction. 


Dynamic rounding mode uses the IEEE rounding mode selected by the 
FPCR. It can be used with any of the IEEE rounding modes. (Dynamic 
rounding mode is described in Section 4.1.2.) 


Alpha IEEE arithmetic does rounding before detecting overflow or underflow. 
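

The mapping rules for the rounding modes in this section can be illustrated 
with exact rational arithmetic. The following Python sketch rounds a true 
result to a multiple of one LSB under each mode. The function name 
round_to_lsb and the mode strings are assumptions made for this example, the 
LSB is treated as a fixed grid spacing, and overflow and underflow detection 
are deliberately left out. 

```python
from fractions import Fraction
import math


def round_to_lsb(true_result, lsb, mode):
    """Map an exact value to a multiple of lsb under the named rounding mode."""
    scaled = Fraction(true_result) / Fraction(lsb)
    lower = math.floor(scaled)    # neighbor toward minus infinity
    upper = math.ceil(scaled)     # neighbor toward plus infinity
    if lower == upper:
        chosen = lower            # exactly representable: all modes agree
    elif mode == "minus_infinity":
        chosen = lower
    elif mode == "plus_infinity":
        chosen = upper
    elif mode == "chopped":
        chosen = lower if scaled > 0 else upper   # smaller in magnitude
    elif mode in ("ieee_normal", "vax_normal"):
        below, above = scaled - lower, upper - scaled
        if below != above:
            chosen = lower if below < above else upper   # nearest neighbor
        elif mode == "ieee_normal":
            chosen = lower if lower % 2 == 0 else upper  # tie: fraction ends in 0
        else:
            chosen = upper if scaled > 0 else lower      # tie: larger in magnitude
    else:
        raise ValueError("unknown rounding mode: " + mode)
    return chosen * Fraction(lsb)
```

With an LSB of 1, the halfway value 5/2 maps to 2 under IEEE normal rounding 
and to 3 under VAX normal rounding, which shows the difference between the 
unbiased and biased tie rules. 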


4.1.5 Floating-Point Instruction Qualifiers 


Many of the floating-point instructions accept a qualifier that specifies 
rounding and trapping modes. 


The following table lists the rounding mode qualifiers. See Section 4.1.4 for 
a detailed description of the rounding modes. 


Rounding Mode            Qualifier 

VAX Rounding Mode 
   Normal rounding       (no modifier) 
   Chopped               c 

IEEE Rounding Mode 
   Normal rounding       (no modifier) 
   Plus infinity         d (ensure that the dyn field of the FPCR is 11) 
   Minus infinity        m 
   Chopped               c 


The following table lists the trapping mode qualifiers. See Section 4.1.3 for a 
detailed description of the exceptions. 


Trapping Mode                                              Qualifier 

VAX Trap Mode 
   Imprecise, underflow disabled                           (no modifier) 
   Imprecise, underflow enabled                            u 
   Software, underflow disabled                            s 
   Software, underflow enabled                             su 

VAX Convert-to-Integer Trap Mode 
   Imprecise, integer overflow disabled                    (no modifier) 
   Imprecise, integer overflow enabled                     v 
   Software, integer overflow disabled                     s 
   Software, integer overflow enabled                      sv 

IEEE Trap Mode 
   Imprecise, underflow disabled, inexact disabled         (no modifier) 
   Imprecise, underflow enabled, inexact disabled          u 
   Software, underflow enabled, inexact disabled           su 
   Software, underflow enabled, inexact enabled            sui 

IEEE Convert-to-Integer Trap Mode 
   Imprecise, integer overflow disabled, inexact disabled  (no modifier) 
   Imprecise, integer overflow enabled, inexact disabled   v 
   Software, integer overflow enabled, inexact disabled    sv 
   Software, integer overflow enabled, inexact enabled     svi 


Table 4-1 lists the qualifier combinations that are supported by one or 
more of the individual instructions. The values in the Number column are 
referenced in subsequent sections to identify the combination of qualifiers 
accepted by the various instructions. 


Table 4-1: Qualifier Combinations for Floating-Point Instructions 


Number   Qualifiers 

1        c, u, uc, s, sc, su, suc 
2        c, m, d, u, uc, um, ud, su, suc, sum, sud, sui, suic, suim, suid 
3        s 
4        su 
5        sv, v 
6        c, v, vc, s, sc, sv, svc 
7        c, v, vc, sv, svc, svi, svic, d, vd, svd, svid 
8        c 
9        c, m, d, sui, suic, suim, suid 
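

Table 4-1 works as a lookup table: each instruction format table gives a 
number, and that number selects a set of accepted qualifiers. The following 
Python sketch transcribes the table into a dictionary and checks a qualifier 
against a combination number. The names QUALIFIER_COMBINATIONS and 
accepts_qualifier are hypothetical, and the sketch does not model the 
unqualified form of an instruction or the way the assembler spells a 
qualified mnemonic. 

```python
# Qualifier sets keyed by the Number column of Table 4-1.
QUALIFIER_COMBINATIONS = {
    1: "c u uc s sc su suc",
    2: "c m d u uc um ud su suc sum sud sui suic suim suid",
    3: "s",
    4: "su",
    5: "sv v",
    6: "c v vc s sc sv svc",
    7: "c v vc sv svc svi svic d vd svd svid",
    8: "c",
    9: "c m d sui suic suim suid",
}


def accepts_qualifier(number, qualifier):
    """Return True if qualifier appears in the combination with the given number."""
    return qualifier in QUALIFIER_COMBINATIONS[number].split()
```

For example, combination 2 accepts the suid qualifier, while combination 1 
does not accept m. 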


4.2 Floating-Point Load and Store Instructions 


Floating-point load and store instructions load values and move data 
between memory and floating-point registers. 


Table 4-2 lists the mnemonics and operands for instructions that perform 
floating-point load and store operations. The table is divided into groups of 
functionally related instructions. The operands specified within a particular 
group apply to all of the instructions contained in that group. 


Table 4-2: Load and Store Instruction Formats 


Instruction                                   Mnemonic   Operands 

Load F_floating                               ldf [a]    d_reg, address 
Load G_floating (Load D_floating)             ldg [a] 
Load S_floating (Load Longword)               lds [a] 
Load T_floating (Load Quadword)               ldt [a] 

Load Immediate F_floating                     ldif       d_reg, val_expr 
Load Immediate D_floating                     ldid 
Load Immediate G_floating                     ldig 
Load Immediate S_floating (Load Longword)     ldis 
Load Immediate T_floating (Load Quadword)     ldit 

Store F_floating                              stf [a]    s_reg, address 
Store G_floating (Store D_floating)           stg [a] 
Store S_floating (Store Longword)             sts [a] 
Store T_floating (Store Quadword)             stt [a] 


[a] In addition to the normal operands that can be specified with this 
instruction, relocation operands can also be specified (see Section 2.6.4). 


Table 4-3 describes the operations performed by floating-point load and 
store instructions. 


The load and store instructions are grouped by function. See Table 4-2 for 
the instruction names. 


Table 4-3: Load and Store Instruction Descriptions 


Instruction 
    Description 


Load Instructions (ldf, ldg, lds, ldt, ldif, ldid, ldig, ldis, ldit) 
    Load eight bytes (G_, D_, and T_floating formats) or 
    four bytes (F_ and S_floating formats) from the specified 
    effective address into the destination register. The address 
    must be quadword aligned for 8-byte load instructions and 
    longword aligned for 4-byte load instructions. 


Store Instructions (stf, stg, sts, stt) 
    Store eight bytes (G_, D_, and T_floating formats) 
    or four bytes (F_ and S_floating formats) from the 
    source floating-point register into the specified effective 
    address. The address must be quadword aligned 
    for 8-byte store instructions and longword aligned 
    for 4-byte store instructions. 
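

The alignment rule in Table 4-3 can be expressed as a simple check on the 
effective address. The following Python sketch models 4-byte and 8-byte 
floating-point loads and stores against a byte array. It covers only the IEEE 
formats (S_floating and T_floating), assumes little-endian byte order, and 
raises ValueError for an unaligned address purely as a modeling choice; the 
function names are hypothetical. 

```python
import struct

# struct format codes for the two modeled sizes (little-endian IEEE values).
_FORMATS = {4: "<f", 8: "<d"}


def _check(address, size):
    """Reject unsupported sizes and addresses that are not a multiple of size."""
    if size not in _FORMATS:
        raise ValueError("size must be 4 or 8 bytes")
    if address % size != 0:
        raise ValueError("address is not aligned to %d bytes" % size)


def store_floating(memory, address, size, value):
    """Model an aligned floating-point store of `size` bytes into a bytearray."""
    _check(address, size)
    struct.pack_into(_FORMATS[size], memory, address, value)


def load_floating(memory, address, size):
    """Model an aligned floating-point load of `size` bytes from a bytearray."""
    _check(address, size)
    return struct.unpack_from(_FORMATS[size], memory, address)[0]
```

In this model, an 8-byte access at address 8 succeeds, while an 8-byte access 
at address 4 is rejected because the address is only longword aligned. 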


4.3 Floating-Point Arithmetic Instructions 


Floating-point arithmetic instructions perform arithmetic and logical 
operations on values in floating-point registers. 


Table 4-4 lists the mnemonics and operands for instructions that perform 
floating-point arithmetic and logical operations. The table is divided into 
groups of functionally related instructions. The operands specified within a 
particular group apply to all of the instructions contained in that group. 


The Qualifiers column in Table 4-4 refers to one or more trap or rounding 
modes as specified in Table 4-1. 


Table 4-4: Arithmetic Instruction Formats 


Instruction                         Mnemonic   Qualifiers   Operands 

Floating Clear                      fclr       -            d_reg 

Floating Absolute Value             fabs       -            s_reg, d_reg or d_reg/s_reg 
Floating Negate                     fneg       - 
Negate F_floating                   negf       3 
Negate G_floating                   negg       3 
Negate S_floating                   negs       4 
Negate T_floating                   negt       4 

Add F_floating                      addf       1            s_reg1, s_reg2, d_reg or 
Add G_floating                      addg       1            d_reg/s_reg1, s_reg2 
Add S_floating                      adds       2 
Add T_floating                      addt       2 
Divide F_floating                   divf       1 
Divide G_floating                   divg       1 
Divide S_floating                   divs       2 
Divide T_floating                   divt       2 
Multiply F_floating                 mulf       1 
Multiply G_floating                 mulg       1 
Multiply S_floating                 muls       2 
Multiply T_floating                 mult       2 
Subtract F_floating                 subf       1 
Subtract G_floating                 subg       1 
Subtract S_floating                 subs       2 
Subtract T_floating                 subt       2 

Convert Quadword to Longword        cvtql      5            s_reg, d_reg or d_reg/s_reg 
Convert Longword to Quadword        cvtlq      - 
Convert G_floating to Quadword      cvtgq      6 
Convert T_floating to Quadword      cvttq      7 
Convert Quadword to F_floating      cvtqf      8 
Convert Quadword to G_floating      cvtqg      8 
Convert Quadword to S_floating      cvtqs      9 
Convert Quadword to T_floating      cvtqt      9 
Convert D_floating to G_floating    cvtdg      1 
Convert G_floating to D_floating    cvtgd      1 
Convert G_floating to F_floating    cvtgf      1 
Convert T_floating to S_floating    cvtts      2 
Convert S_floating to T_floating    cvtst      3 


Table 4-5 describes the operations performed by floating-point arithmetic 
instructions. The arithmetic instructions are grouped by function. See 
Table 4-4 for the instruction names. 
Table 4-5: Arithmetic Instruction Descriptions

Instruction
    Description

Clear Instruction (fclr)
    Clears the destination register.

Absolute Value Instruction (fabs)
    Computes the absolute value of the contents of the source register
    and puts the floating-point result in the destination register.

Negate Instructions (fneg, negf, negg, negs, negt)
    Computes the negative value of the contents of s_reg or d_reg and
    puts the specified precision floating-point result in d_reg.

Add Instructions (addf, addg, adds, addt)
    Adds the contents of s_reg1 or d_reg to the contents of s_reg2 and
    puts the result in d_reg. When the sum of two operands is exactly
    zero, the sum has a positive sign for all rounding modes except
    round toward -INF. For that rounding mode, the sum has a negative
    sign.

Divide Instructions (divf, divg, divs, divt)
    Computes the quotient of two values. These instructions divide the
    contents of s_reg1 or d_reg by the contents of s_reg2 and put the
    result in d_reg. If the divisor is zero, an error is signaled if
    the divide-by-zero exception is enabled.

Multiply Instructions (mulf, mulg, muls, mult)
    Multiplies the contents of s_reg1 or d_reg by the contents of
    s_reg2 and puts the result in d_reg.

Subtract Instructions (subf, subg, subs, subt)
    Subtracts the contents of s_reg2 from the contents of s_reg1 or
    d_reg and puts the result in d_reg. When the difference of two
    operands is exactly zero, the difference has a positive sign for
    all rounding modes except round toward -INF. For that rounding
    mode, the difference has a negative sign.

Conversion Between Integer Formats Instructions (cvtql, cvtlq)
    Converts the integer contents of s_reg to the specified integer
    format and puts the result in d_reg. If an integer overflow occurs,
    the truncated result is stored in d_reg and, if enabled, an
    arithmetic trap occurs.

Conversion from Floating-Point to Integer Format Instructions
(cvtgq, cvttq)
    Converts the floating-point contents of s_reg to the specified
    integer format and puts the result in d_reg. If an integer overflow
    occurs, the truncated result is stored in d_reg and, if enabled, an
    arithmetic trap occurs.

Conversion from Integer to Floating-Point Format Instructions
(cvtqf, cvtqg, cvtqs, cvtqt)
    Converts the integer contents of s_reg to the specified
    floating-point format and puts the result in d_reg.

Conversion Between Floating-Point Formats Instructions
(cvtdg, cvtgd, cvtgf, cvtts, cvtst)
    Converts the contents of s_reg to the specified precision, rounds
    according to the rounding mode, and puts the result in d_reg. If an
    overflow occurs, an unpredictable value is stored in d_reg and a
    floating-point trap occurs.
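
The overflow behavior described for the integer conversions can be sketched with a small model. This is only an illustration, not the hardware definition: the function name cvtql is reused as a hypothetical Python helper, "truncated result" is assumed to mean the low-order 32 bits of the quadword, and the optional arithmetic trap is reduced to a returned flag.

```python
def cvtql(quadword):
    """Model a quadword-to-longword conversion of a signed 64-bit integer.

    Returns (longword, overflowed). The longword is the low-order 32 bits
    reinterpreted as a signed value (an assumption about what "truncated"
    means); overflowed is True when the input did not fit in 32 bits.
    """
    low = quadword & 0xFFFFFFFF      # keep only the low-order 32 bits
    if low >= 0x80000000:            # bit 31 set: value is negative
        low -= 0x100000000
    return low, low != quadword


print(cvtql(5))        # value fits, no overflow reported
print(cvtql(2**31))    # value does not fit, truncated result plus flag
```

A real program would see the trap only if the corresponding qualifier from Table 4-1 enables it; the flag here simply marks the cases where that trap could occur.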


4.4 Floating-Point Relational Instructions 


Floating-point relational instructions compare two floating-point values. 


Table 4-6 lists the mnemonics and operands for instructions that perform 
floating-point relational operations. Each of the instructions can take an 
operand in any of the forms shown. 


The Qualifiers column in Table 4-6 refers to one or more trap or rounding 
modes as specified in Table 4-1. 


Table 4-6: Relational Instruction Formats

Instruction                              Mnemonic  Qualifiers  Operands

Compare G_floating Equal                 cmpgeq    3           s_reg1, s_reg2, d_reg or
Compare G_floating Less Than             cmpglt    3           d_reg/s_reg1, s_reg2
Compare G_floating Less Than or Equal    cmpgle    3
Compare T_floating Equal                 cmpteq    4
Compare T_floating Less Than             cmptlt    4
Compare T_floating Less Than or Equal    cmptle    4
Compare T_floating Unordered             cmptun    4


Table 4-7 describes the relational instructions supported by the assembler. 
The relational instructions are grouped by function. See Table 4-6 for the 
instruction names. 
Table 4-7: Relational Instruction Descriptions

Instruction
    Description

Compare Equal Instructions (cmpgeq, cmpteq)
    Compare the contents of s_reg1 with the contents of s_reg2. If
    s_reg1 equals s_reg2, a nonzero value is written to the destination
    register; otherwise, a true zero value is written to the
    destination. Exceptions are not signaled for unordered values.

Compare Less Than Instructions (cmpglt, cmptlt)
    Compare the contents of s_reg1 with the contents of s_reg2. If
    s_reg1 is less than s_reg2, a nonzero value is written to the
    destination register; otherwise, a true zero value is written to
    the destination. Exceptions are not signaled for unordered values.

Compare Less Than or Equal Instructions (cmpgle, cmptle)
    Compare the contents of s_reg1 with the contents of s_reg2. If
    s_reg1 is less than or equal to s_reg2, a nonzero value is written
    to the destination register; otherwise, a true zero value is
    written to the destination. Exceptions are not signaled for
    unordered values.

Compare Unordered Instruction (cmptun)
    Compare the contents of s_reg1 with the contents of s_reg2. If
    either s_reg1 or s_reg2 is unordered, a nonzero value is written to
    the destination register; otherwise, a true zero value is written
    to the destination. Exceptions are not signaled for unordered
    values.
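
The T_floating comparisons in Table 4-7 can be modelled in a few lines. This is a sketch under stated assumptions: the Python functions only borrow the mnemonic names, "unordered" is taken to mean that at least one operand is an IEEE NaN, and TRUE_RESULT is an arbitrary stand-in for "a nonzero value" because the table does not say which nonzero value is written.

```python
import math

TRUE_RESULT = 1.0    # assumption: any nonzero value serves for this model
FALSE_RESULT = 0.0   # "true zero"


def _unordered(a, b):
    """At least one operand is a NaN, so no ordering exists."""
    return math.isnan(a) or math.isnan(b)


def cmpteq(a, b):
    return TRUE_RESULT if (not _unordered(a, b) and a == b) else FALSE_RESULT


def cmptlt(a, b):
    return TRUE_RESULT if (not _unordered(a, b) and a < b) else FALSE_RESULT


def cmptle(a, b):
    return TRUE_RESULT if (not _unordered(a, b) and a <= b) else FALSE_RESULT


def cmptun(a, b):
    return TRUE_RESULT if _unordered(a, b) else FALSE_RESULT


nan = float("nan")
print(cmptle(1.0, 2.0), cmptle(nan, 2.0), cmptun(nan, 2.0))
```

The point of the model is that an unordered pair makes every ordered comparison write zero, and only cmptun distinguishes that case from an ordinary false result.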


4.5 Floating-Point Move Instructions 


Floating-point move instructions move data between floating-point registers. 


Table 4-8 lists the mnemonics and operands for instructions that perform 
floating-point move operations. The table is divided into groups of 
functionally related instructions. The operands specified within a particular 
group apply to all of the instructions contained in that group. 


Table 4-8: Move Instruction Formats

Instruction                              Mnemonic  Operands

Floating Move                            fmov      s_reg, d_reg

Copy Sign                                cpys      s_reg1, s_reg2, d_reg or
Copy Sign Negate                         cpysn     d_reg/s_reg1, s_reg2
Copy Sign and Exponent                   cpyse
Move If Equal to Zero                    fcmoveq
Move If Not Equal to Zero                fcmovne
Move If Less Than Zero                   fcmovlt
Move If Less Than or Equal to Zero       fcmovle
Move If Greater Than Zero                fcmovgt
Move If Greater Than or Equal to Zero    fcmovge
Table 4-9 describes the operations performed by move instructions. The 
move instructions are grouped by function. See Table 4-8 for the instruction 
names. 


Table 4-9: Move Instruction Descriptions

Instruction
    Description

Move Instruction (fmov)
    Moves the contents of s_reg to d_reg.

Copy Sign Instruction (cpys)
    Fetches the sign bit of s_reg1 or d_reg, combines it with the
    exponent and fraction of s_reg2, and copies the result to d_reg.

Copy Sign Negate Instruction (cpysn)
    Fetches the sign bit of s_reg1 or d_reg, complements it, combines
    it with the exponent and fraction of s_reg2, and copies the result
    to d_reg.

Copy Sign and Exponent Instruction (cpyse)
    Fetches the sign and exponent of s_reg1 or d_reg, combines them
    with the fraction of s_reg2, and copies the result to d_reg.

Move If Instructions (fcmoveq, fcmovne, fcmovlt, fcmovle, fcmovgt,
fcmovge)
    Compares the contents of s_reg1 or d_reg against zero. If the
    specified condition is true, the contents of s_reg2 are copied to
    d_reg; otherwise, d_reg is unchanged.
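
The three copy-sign operations in Table 4-9 are bit-field merges, which the following sketch makes concrete. It assumes the operands are IEEE double-precision (T_floating) values with the sign in bit 63, the exponent in bits 62 through 52, and the fraction in bits 51 through 0; the Python function names simply mirror the mnemonics and are not part of any assembler interface.

```python
import struct

SIGN = 0x8000000000000000
EXPONENT = 0x7FF0000000000000
FRACTION = 0x000FFFFFFFFFFFFF


def _bits(value):
    """Return the 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def _value(bits):
    """Return the Python float with the given 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def cpys(s_reg1, s_reg2):
    """Sign of s_reg1, exponent and fraction of s_reg2."""
    return _value((_bits(s_reg1) & SIGN) | (_bits(s_reg2) & (EXPONENT | FRACTION)))


def cpysn(s_reg1, s_reg2):
    """Complemented sign of s_reg1, exponent and fraction of s_reg2."""
    sign = (_bits(s_reg1) ^ SIGN) & SIGN
    return _value(sign | (_bits(s_reg2) & (EXPONENT | FRACTION)))


def cpyse(s_reg1, s_reg2):
    """Sign and exponent of s_reg1, fraction of s_reg2."""
    return _value((_bits(s_reg1) & (SIGN | EXPONENT)) | (_bits(s_reg2) & FRACTION))


print(cpys(-1.0, 3.5), cpysn(-1.0, 3.5), cpyse(8.0, 1.5))
```

Taking the sign from a zero operand yields an absolute value, and cpysn with both sources equal yields a negation, which is one way to see how the move, absolute value, and negate operations relate to copy sign.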


4.6 Floating-Point Control Instructions 


Floating-point control instructions test floating-point registers and 
conditionally branch. 


Table 4-10 lists the mnemonics and operands for instructions that perform 
floating-point control operations. The specified operands apply to all of the 
instructions listed in the table. 


Table 4-10: Control Instruction Formats

Instruction                              Mnemonic  Operands

Branch Equal to Zero                     fbeq      s_reg, label
Branch Not Equal to Zero                 fbne
Branch Less Than Zero                    fblt
Branch Less Than or Equal to Zero        fble
Branch Greater Than Zero                 fbgt
Branch Greater Than or Equal to Zero     fbge
Table 4-11 describes the operations performed by control instructions. The 
control instructions are grouped by function. See Table 4-10 for instruction 
names. 


Table 4-11: Control Instruction Descriptions

Instruction
    Description

Branch Instructions (fbeq, fbne, fblt, fble, fbgt, fbge)
    The contents of the source register are compared with zero. If the
    specified relationship is true, a branch is made to the specified
    label.


4.7 Floating-Point Special-Purpose Instructions 


Floating-point special-purpose instructions perform miscellaneous tasks. 


Table 4-12 lists the mnemonics and operands for instructions that perform 
floating-point special-purpose operations. 


Table 4-12: Special-Purpose Instruction Formats

Instruction                              Mnemonic  Operands

Move from FP Control Register            mf_fpcr   d_reg
Move to FP Control Register              mt_fpcr   s_reg
No Operation                             fnop      (none)


Table 4-13 describes the operations performed by floating-point 
special-purpose instructions. 


Table 4-13: Control Register Instruction Descriptions

Instruction
    Description

Move to FPCR Instruction (mt_fpcr)
    Copies the value in the specified source register to the
    floating-point control register (FPCR).

Move from FPCR Instruction (mf_fpcr)
    Copies the value in the floating-point control register (FPCR) to
    the specified destination register.

No Operation Instruction (fnop)
    This instruction has no effect on the machine state.
Assembler Directives 


Assembler directives are instructions to the assembler to perform various 
bookkeeping tasks, storage reservation, and other control functions. To 
distinguish them from other instructions, directive names begin with a 
period. Table 5-1 lists the assembler directives by category. 


Table 5-1: Summary of Assembler Directives

Category                                    Directives

Compiler-Use-Only Directives                .err, .file, .lab, .loc, .option

Location Control Directives                 .align, .data, .lit4, .lit8,
                                            .rconst, .rdata, .sdata, .space,
                                            .text, .tlsdata

Symbol Declaration Directives               .extern, .globl, .struct,
                                            symbolic equate, .weakext

Routine Entry Point Definition Directives   .aent, .ent

Data Storage Directives                     .ascii, .asciiz, .byte, .comm,
                                            .double, .d_floating, .extended,
                                            .float, .f_floating, .gprel32,
                                            .g_floating, .lcomm, .long, .quad,
                                            .s_floating, .tlscomm, .tlslcomm,
                                            .t_floating, .word, .x_floating

Repeat Block Directives                     .endr, .repeat

Assembler Option Directive                  .set

Procedure Attribute Directives              .edata, .eflag, .end, .fmask,
                                            .frame, .mask, .prologue, .save_ra

Version Control Directives                  .ident, .verstamp

Scheduling and Architecture Subset          .arch, .tune
Directives


The following list contains descriptions of the assembly directives (in 
alphabetical order): 


.aent name 


Sets an alternate entry point for the current procedure. Use this 
information when you want to generate information for the debugger. 
This directive must appear between a pair of .ent and .end directives. 
.align expression 


Sets low-order bits in the location counter to zero. The value of 
expression establishes the number of bits to be set to zero. The 
maximum value for expression is 16 (which produces 64K alignment). 


If the .align directive advances the location counter, the assembler 
fills the skipped bytes with zeros in data sections and nop instructions 
in text sections. 


Normally, the .word, .long, .quad, .float, .double, .extended, 
.d_floating, .f_floating, .g_floating, .s_floating, 
.t_floating, and .x_floating directives automatically align their 
data appropriately. For example, .word does an implicit .align 1, 
and .double does an implicit .align 3. 


You can disable the automatic alignment feature with .align 0. The 
assembler reinstates automatic alignment at the next .text, .data, 
.rdata, or .sdata directive that it encounters. 


Labels immediately preceding an automatic or explicit alignment are 
also realigned. For example: 


foo:    .align 3 
        .word 0 


This is equivalent to: 


        .align 3 
foo:    .word 0 
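

The effect of .align on the location counter can be sketched as a rounding operation. This is a model only: align_location is a hypothetical helper name, and it assumes the counter is rounded up (never down) to the next address whose low-order expression bits are zero, which matches the statement that the directive may advance the location counter.

```python
def align_location(location, expression):
    """Return the location counter after `.align expression`.

    The low-order `expression` bits of the result are zero. Rounding up
    is an assumption drawn from "advances the location counter".
    """
    if not 0 <= expression <= 16:
        raise ValueError("expression must be between 0 and 16")
    boundary = 1 << expression            # 2**expression bytes
    return (location + boundary - 1) & ~(boundary - 1)


print(align_location(5, 3))      # next quadword boundary at or after 5
print(align_location(1, 16))     # largest supported alignment, 64K
```

The bytes between the old and new counter values are the ones the assembler fills with zeros or nop instructions. The model returns the counter unchanged for an expression of 0; it does not represent the side effect of disabling automatic alignment.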


.arch model 


Specifies the version of the Alpha architecture that the assembler is 
to generate instructions for. The valid values for model are identical 
to those you can specify with the -arch flag on the cc command line. 
See cc(1) for details. 


.ascii string[, string]... 


Assembles each string from the list into successive locations. The 
.ascii directive does not pad the string with null characters. You 
must put double quotation marks (") around each string. You can 
optionally use the backslash escape characters. For a list of the 
backslash characters, see Section 2.4.3. 


.asciiz string[, string]... 


Assembles each string in the list into successive locations and adds a 
null character. You can optionally use the backslash escape characters. 
For a list of the backslash characters, see Section 2.4.3. 
.byte expression1 [,expression2] ... [,expressionN] 


Truncates the values of the expressions specified in the 
comma-separated list to 8-bit values, and assembles the values in 
successive locations. The values of the expressions must be absolute. 


The operands for the .byte directive can optionally have the following 
form: 


expressionVal [: expressionRep] 


The expressionVal is an 8-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 
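

The truncation and replication rules for .byte operands can be illustrated with a short model. It is a sketch, not the assembler's implementation: expand_byte_operands is a hypothetical name, each operand is given as a (value, repeat) pair with a repeat of 1 standing for an operand written without a repetition count, and truncation is assumed to keep the low-order 8 bits.

```python
def expand_byte_operands(operands):
    """Return the bytes a list of (value, repeat) operands would emit."""
    emitted = bytearray()
    for value, repeat in operands:
        if repeat < 0:
            raise ValueError("repetition count must be non-negative")
        # Truncate to 8 bits, then replicate `repeat` times.
        emitted.extend([value & 0xFF] * repeat)
    return bytes(emitted)


# Models two operands: 0x1FF with no count, and 7 repeated three times.
print(expand_byte_operands([(0x1FF, 1), (7, 3)]).hex())
```

The same value-and-count form appears in the floating-point and integer data directives later in this list; only the element width changes.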


.comm name, expression1 [,expression2] 


Unless defined elsewhere, name becomes a global common symbol at 
the head of a block of at least expression1 bytes of storage. The 
linker overlays like-named common blocks, using the expression value 
of the largest block as the byte size of the overlay. The expression2 
operand has the same effect on alignment as the operand for the 
.align directive. 


.data 


Directs the assembler to add all subsequent data to the .data section. 


.d_floating expression1 [,expression2] ... [,expressionN] 


Initializes memory to double-precision (64-bit) VAX D_floating 
numbers. The values of the expressions must be absolute. 


The operands for the .d_floating directive can optionally have the 
following form: 


expressionVal [: expressionRep] 


The expressionVal is a 64-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .d_floating directive automatically aligns its data and any 
preceding labels on a double-word boundary. You can disable this 
feature with the .align 0 directive. 


.double expression1 [,expression2] ... [,expressionN] 


Synonym for .t_floating. 
.edata 0 
.edata 1 lang-handler relocatable-expression 
.edata 2 lang-handler constant-expression 


Marks data related to exception handling. 


If the flag is 0, the assembler adds all subsequent data to the .xdata 
section. 


If the flag is 1 or 2, the assembler creates a function table 
entry for the next .ent directive. The function table entry 
contains the language-specific handler (lang-handler) and data 
(relocatable-expression or constant-expression). 


.eflag flags 


Encodes exception-related flags to be stored in the PDSC_RPD_FLAGS 
field of the procedure’s run-time procedure descriptor. See the Calling 
Standard for Alpha Systems for a description of the individual flags. 


.end [proc_name] 


Sets the end of a procedure. The .ent directive sets the beginning of 
a procedure. Use the .ent and .end directives when you want to 
generate information for the debugger. 


.endr 


Signals the end of a repeat block. The .repeat directive starts a 
repeat block. 


.ent proc_name [lex-level] 


Sets the beginning of the procedure proc_name. Use this directive 
when you want to generate information for the debugger. The .end 
directive sets the end of a procedure. 


The lex-level operand indicates the number of procedures that 
statically surround the current procedure. This operand is only 
informational. It does not affect the assembly process; the assembler 
ignores it. 


.err 


For use only by compilers. This directive causes the assembler to signal 
an error. Any compiler frontend that detects an error condition puts 
this directive in the input stream. When the assembler encounters a 
.err directive, it issues an error message and ceases to assemble the 
source file. This prevents the assembler from continuing to process a 
program that is incorrect. 
.extended expression1 [,expression2] ... [,expressionN] 


Synonym for .x_floating. 


.extern [(THREADS)] name [ number ] 


Indicates that the specified symbol is global and external; that is, the 
symbol is defined in another object module and cannot be defined until 
link time. The name operand is a global undefined symbol and number 
is the expected size of the external object. If the THREADS argument 
is specified, the symbol is treated as a tls (thread local storage) global 
undefined symbol. 


.f_floating expression1 [,expression2] ... [,expressionN] 


Initializes memory to single-precision (32-bit) VAX F_floating numbers. 
The values of the expressions must be absolute. 


The operands for the .f_floating directive can optionally have the 
following form: 


expressionVal [: expressionRep] 


The expressionVal is a 32-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .f_floating directive automatically aligns its data and 
preceding labels on a longword boundary. You can disable this feature 
by using the .align 0 directive. 


.file file_number file_name_string 


For use only by compilers. Specifies the source file from which the 
assembly instructions that follow originated. This directive causes 
the assembler to stop generating line numbers that are used by the 
debugger. A subsequent .loc directive causes the assembler to resume 
generating line numbers. 


.float expression1 [,expression2] ... [,expressionN] 


Synonym for .s_floating. 


.fmask mask offset 


Sets a mask with a bit turned on for each floating-point register that 
the current routine saved. The least-significant bit corresponds to 
register $f0. The offset is the distance, in bytes, from the virtual 
frame pointer to where the floating-point registers are saved. 


You must use .ent before .fmask, and you can use only one .fmask 
for each .ent. Space should be allocated for those registers specified 
in the .fmask. 


.frame frame-reg frame-size return_pc-reg [local_offset] 


Describes a stack frame. The first register is the frame register, 
and frame-size is the size of the stack frame, that is, the number of 
bytes between the frame register and the virtual frame pointer. The 
second register specifies the register that contains the return address. 
The local_offset parameter, which is for use only by compilers, 
specifies the number of bytes between the virtual frame pointer and 
the local variables. 


You must use .ent before .frame, and you can use only one .frame 
for each .ent. No stack traces can be done in the debugger without 
the .frame directive. 


.g_floating expression1 [,expression2] ... [,expressionN] 


Initializes memory to double-precision (64-bit) VAX G_floating 
numbers. The values of the expressions must be absolute. 


The operands for the .g_floating directive can optionally have the 
following form: 


expressionVal [: expressionRep] 


The expressionVal is a 64-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .g_floating directive automatically aligns its data and any 
preceding labels on a quadword boundary. You can disable this feature 
with the .align 0 directive. 


.globl name 


Identifies name as an external symbol. If the name is otherwise defined 
(for example, by its appearance as a label), the assembler exports the 
symbol; otherwise, it imports the symbol. In general, the assembler 
imports undefined symbols; that is, it gives them the UNIX storage 
class “global undefined” and requires the linker to resolve them. 


.gprel32 address1 [,address2] ... [,addressN] 


Truncates the signed displacement between the global pointer value 
and the addresses specified in the comma-separated list to 32-bit 
values, and assembles the values in successive locations. 


The operands for the .gprel32 directive can optionally have the 
following form: 


addressVal [: addressRep] 


The addressVal is the address value. The optional addressRep is 
a non-negative expression that specifies how many times to replicate 
the value of addressVal. The expression value (addressVal) and 
repetition count (addressRep) must be absolute. 


The .gprel32 directive automatically aligns its data and preceding 
labels on a longword boundary. You can disable this feature with the 
.align 0 directive. 


.ident string 


Allows the specification of a string that the assembler stores in the .o 
file created during assembly. This string can be searched for in a .o 
file or in an executable using the what(1) command. 


.lab label_name 


For use only by compilers. Associates a named label with the current 
location in the program text. 


.lcomm name, expression1 [,expression2] 


Gives the named symbol (name) a data type of bss. The assembler 
allocates the named symbol to the bss area, and expression1 defines 
the named symbol's length. If a .globl directive also specifies the 
name, the assembler allocates the named symbol as an external 
symbol. The expression2 operand has the same effect on alignment 
as the operand for the .align directive. If expression2 is not 
specified, the alignment defaults to quadword alignment. 


The assembler puts bss symbols in one of two bss areas. If the defined 
size is less than or equal to the size specified by the assembler or 
compiler’s -G command-line option, the assembler puts the symbols in 
the sbss area. 


.lit4 


Allows 4-byte constants to be generated and placed in the .lit4 section. 
This directive is only valid for .long (with nonrelocatable expressions), 
.f_floating, .float, and .s_floating. 


.lit8 


Allows 8-byte constants to be generated and placed in the .lit8 section. 
This directive is only valid for .quad (with nonrelocatable expressions), 
.d_floating, .g_floating, .double, and .t_floating. 


.loc file_number line_number 


For use only by compilers. Specifies the source file and the line within 
it that corresponds to the assembly instructions that follow. The 
assembler ignores the file number when this directive appears in the 
assembly source file. Then, the assembler assumes that the directive 
refers to the most recent .file directive. 


.long expression1 [,expression2] ... [,expressionN] 


Truncates the values of the expressions specified in the 
comma-separated list to 32-bit values, and assembles the values in 
successive locations. The values of the expressions can be relocatable. 


The operands for the .long directive can optionally have the following 
form: 


expressionVal [: expressionRep] 


The expressionVal is a 32-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .long directive automatically aligns its data and preceding labels 
on a longword boundary. You can disable this feature with the .align 
0 directive. 


.mask mask, offset 


Sets a mask with a bit turned on for each general-purpose register 
that the current routine saved. The least significant bit corresponds 
to register $0. The offset is the distance, in bytes, from the virtual 
frame pointer to where the registers are saved. 


You must use .ent before .mask, and you can use only one .mask 
for each .ent. Space should be allocated for those registers specified 
in the .mask. 
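

The mask operand of .mask and .fmask is a plain bit set indexed by register number, which a few lines of code make concrete. The helper name register_mask is hypothetical, and the register numbers used below (9, 10, and 26) are chosen only for illustration.

```python
def register_mask(registers):
    """Build a save mask with bit n set for each saved register number n.

    Bit 0 is the least-significant bit and stands for register $0 (or
    $f0 when the mask is meant for .fmask).
    """
    mask = 0
    for number in registers:
        if not 0 <= number <= 31:
            raise ValueError("register numbers run from 0 to 31")
        mask |= 1 << number
    return mask


# A routine that saved registers 9, 10, and 26:
print(hex(register_mask([9, 10, 26])))
```

The resulting value is what would be written as the first operand of the directive; the offset operand is independent of it and still has to be supplied separately.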


.option options 


For use only by compilers. Instructs the assembler to replace an 
optimization level that was specified on the command line with the 
one specified in the options argument. Valid entries for this new 
optimization level are o, 01 - 04. 
.prologue flag 


Marks the end of the prologue section of a procedure. 


A flag of 0 indicates that the procedure does not use $gp; the caller 
does not need to set up $pv prior to calling the procedure or restore 
$gp on return from the procedure. 


A flag of 1 indicates that the procedure does use $gp; the caller must 
set up $pv prior to calling the procedure and restore $gp on return 
from the procedure. 


If flag is not specified, the behavior is as if a value of 1 was specified. 


.quad expression1 [,expression2] ... [,expressionN] 


Truncates the values of the expressions specified in the 
comma-separated list to 64-bit values, and assembles the values in 
successive locations. The values of the expressions can be relocatable. 


The operands for the .quad directive can optionally have the following 
form: 


expressionVal [: expressionRep] 


The expressionVal is a 64-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .quad directive automatically aligns its data and preceding labels 
on a quadword boundary. You can disable this feature with the .align 
0 directive. 


.rconst 


Instructs the assembler to add subsequent data into the .rconst 
section. (This is the same as the .rdata directive except that the 
entries cannot be relocatable.) 


.rdata 


Instructs the assembler to add subsequent data into the .rdata 
section. 


.repeat expression 


Repeats all instructions or data between the .repeat and .endr 
directives. The expression defines how many times the enclosed text 
and data repeat. With the .repeat directive, you cannot use labels, 
branch instructions, or values that require relocation in the block. Also 
note that nesting .repeat directives is not allowed. 
.save_ra saved_ra_register 


Specifies that saved_ra_register is the register in which the return 
address is saved during the execution of the procedure. If .save_ra is 
not used, the saved return address register is assumed to be the same 
as the return_pc-reg argument of the .frame directive. The 
.save_ra directive is valid only for register frame procedures. 


.sdata 


Instructs the assembler to add subsequent data to the .sdata section. 


.set option 


Instructs the assembler to enable or disable certain options. The 
assembler has the following default options: reorder, macro, move, 
novolatile, and at. Only one option can be specified by a single .set 
directive. The effects of the options are as follows: 


The reorder option permits the assembler to reorder 
machine-language instructions to improve performance. 


The noreorder option prevents the assembler from reordering 
machine-language instructions. If a machine-language instruction 
violates the hardware pipeline constraints, the assembler issues a 
warning message. 


The macro option permits the assembler to generate multiple 
machine-language instructions from a single assembler instruction. 


The nomacro option causes the assembler to print a warning 
whenever an assembler operation generates more than one 
machine-language instruction. You must select the noreorder 
option before using the nomacro option; otherwise, an error results. 


The at option permits the assembler to use the $at register for 
macros, but generates warnings if the source program uses $at. 


When you use the noat option and an assembler operation requires 
the $at register, the assembler issues a warning message; however, 
the noat option does permit source programs to use $at without 
warnings being issued. 


The nomove option instructs the assembler to mark each 
subsequent instruction so that it cannot be moved during 
reorganization. The assembler can still move instructions from 
below the nomove region to above the region or vice versa. The 
nomove option has part of the effect of the “volatile” C declaration; 
it prevents otherwise independent loads or stores from occurring in 
a different order than intended. 


The move option cancels the effect of nomove. 
The volatile option instructs the assembler that subsequent load 
and store instructions may not be moved in relation to each other 
or removed by redundant load removal or other optimization. The 
volatile option is less restrictive than noreorder; it allows the 
assembler to move other instructions (that is, instructions other 
than load and store instructions) without restrictions. 


The novolatile option cancels the effect of the volatile option. 


.s_floating expression1 [,expression2] ... [,expressionN] 


Initializes memory to single-precision (32-bit) IEEE floating-point 
numbers. The values of the expressions must be absolute. 


The operands for the .s_floating directive can optionally have the 
following form: 


expressionVal [: expressionRep] 


The expressionVal is a 32-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .s_floating directive automatically aligns its data and 
preceding labels on a longword boundary. You can disable this feature 
with the .align 0 directive. 


.space expression 


Advances the location counter by the number of bytes specified by the 
value of expression. The assembler fills the space with zeros. 


.struct expression 


Permits you to lay out a structure using labels plus directives such as 
.word or .byte. It ends at the next segment directive (.data, .text, 
and so forth). It does not emit any code or data, but defines the labels 
within it to have values that are the sum of expression plus their 
offsets from the .struct itself. 
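

How label values come out of a .struct block can be sketched as follows. This is a model under stated assumptions rather than the assembler's algorithm: struct_labels and FIELD_SIZES are hypothetical names, only four data directives are covered, every label is assumed to precede exactly one data directive, and automatic alignment is assumed to be measured from the start of the .struct block.

```python
# Element sizes in bytes, following the directive descriptions:
# .word implies .align 1 (2 bytes), .long 32 bits, .quad 64 bits.
FIELD_SIZES = {".byte": 1, ".word": 2, ".long": 4, ".quad": 8}


def struct_labels(expression, fields):
    """Return {label: value} for (label, directive) pairs in a .struct.

    Each label's value is `expression` plus its offset from the start
    of the block, after the directive's automatic alignment.
    """
    labels = {}
    offset = 0
    for label, directive in fields:
        size = FIELD_SIZES[directive]
        offset = (offset + size - 1) & ~(size - 1)   # automatic alignment
        labels[label] = expression + offset          # label is realigned too
        offset += size
    return labels


layout = struct_labels(0, [("next", ".quad"), ("flag", ".byte"), ("count", ".long")])
print(layout)
```

With an expression of 0 the label values are simply field offsets, which is the usual reason to use the directive; a nonzero expression shifts every label by the same amount.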


symbolic equate 


Takes one of the following forms: name = expression or 
name = register. You must define the name only once in the 
assembly, and you cannot redefine it. The expression must be 
computable when you assemble the program, and the expression must 
involve only operators, constants, or equated symbols. You can use the 
name as a constant in any later statement. 
.text 


Instructs the assembler to add subsequent code to the .text section. 
(This is the default.) 


.tlscomm name, expression 


The name operand becomes a global tls common symbol at the head of a 
block of expression bytes of storage. This directive is analogous to 
the .comm directive. 


.tlsdata 


Directs the assembler to add all subsequent data to the .tlsdata 
section. This directive is analogous to the .data directive. 


.tlslcomm name,expression 


The name operand becomes a symbol of type tlsbss. The assembler 
allocates the symbol to the tlsbss section and the expression defines the 
named symbol’s length. If a .globl directive also specifies the symbol 
name, the assembler allocates the named symbol as an external symbol. 


Unlike non-tls symbols, thread local storage’s bss data is allocated in 
only one area. There is no sbss area for tls symbols. This directive is 
analogous to the .lcomm directive. 


.tlslcomm b 8     /* TlsBss stStatic */ 
.lcomm    B 8     /* SBss stStatic */ 
.globl    c       /* TlsBss stGlobal */ 
.tlslcomm c 8 
.globl    C       /* SBss stGlobal */ 
.lcomm    C 8 


.t_floating expression1 [,expression2] ... [,expressionN] 


Initializes memory to double-precision (64-bit) IEEE floating-point 
numbers. The values of the expressions must be absolute. 


The operands for the .t_floating directive can optionally have the 
following form: 


expressionVal [: expressionRep ] 


The expressionVal is a 64-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .t_floating directive automatically aligns its data and any 
preceding labels on a quadword boundary. You can disable this feature 
with the .align 0 directive. 


.tune option 


Selects processor-specific instruction tuning for various 
implementations of the Alpha architecture. Regardless of the setting 
of the .arch directive, the generated code will run correctly on all 
implementations of the Alpha architecture. The valid values for 
option are identical to those you can specify with the -arch flag on 
the cc command line. See cc(1) for details. 


.verstamp major minor 


Specifies the major and minor version numbers; for example, version 
0.15 would be .verstamp 0 15. 


.weakext name1 [,name2] 


Sets name1 to be a weak symbol during linking. If name2 is specified, 
name1 is created as a weak symbol with the same value as name2. 
Weak symbols can be silently redefined at link time. 


.word expression1 [,expression2] ... [,expressionN] 


Truncates the values of the expressions specified in the 
comma-separated list to 16-bit values, and assembles the values in 
successive locations. The values of the expressions must be absolute. 


The operands for the .word directive can optionally have the following 
form: 


expressionVal [: expressionRep ] 


The expressionVal is a 16-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .word directive automatically aligns its data and preceding labels 
on a word boundary. You can disable this feature with the .align 
0 directive. 
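
The combined effect of truncation and the expressionVal : expressionRep form can be pictured with a small C model. This is only a sketch under stated assumptions: expand_word_operand is a hypothetical name, and the function models the description above rather than the assembler's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of how one ".word value : rep" operand could be expanded:
 * the value is truncated to 16 bits and emitted rep times.
 * Returns the number of 16-bit items written to out. */
size_t expand_word_operand(long value, size_t rep, uint16_t *out, size_t cap)
{
    size_t i;
    uint16_t truncated = (uint16_t)((unsigned long)value & 0xffffUL);

    assert(rep <= cap);              /* caller must supply enough room */
    for (i = 0; i < rep; i++)
        out[i] = truncated;
    return rep;
}
```

Under this model, an operand such as 0x12345 : 3 would produce three successive 16-bit items, each holding 0x2345.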


.x_floating expression1 [,expression2] ... [,expressionN] 


Initializes memory to quad-precision (128-bit) IEEE floating-point 
numbers. The values of the expressions must be absolute. 


The operands for the .x_floating directive can optionally have the 
following form: 


expressionVal [: expressionRep ] 


The expressionVal is a 128-bit value. The optional expressionRep 
is a non-negative expression that specifies how many times to replicate 
the value of expressionVal. The expression value (expressionVal) 
and repetition count (expressionRep) must be absolute. 


The .x_floating directive automatically aligns its data and 
preceding labels on an octaword boundary. You can disable this feature 
with the .align 0 directive. 
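

The automatic alignment performed by the data directives amounts to rounding the location counter up to a multiple of the boundary size. The following C sketch illustrates that arithmetic; the names are hypothetical, and the boundary sizes are the usual Alpha sizes (word 2 bytes, longword 4, quadword 8, octaword 16).

```c
#include <assert.h>

/* Boundary sizes in bytes for the data directives described above. */
enum {
    WORD_BOUNDARY     = 2,   /* .word       */
    LONGWORD_BOUNDARY = 4,   /* .s_floating */
    QUADWORD_BOUNDARY = 8,   /* .t_floating */
    OCTAWORD_BOUNDARY = 16   /* .x_floating */
};

/* Round a location counter up to the next multiple of boundary. */
unsigned long align_counter(unsigned long counter, unsigned long boundary)
{
    /* The boundary must be a nonzero power of two. */
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (counter + boundary - 1) & ~(boundary - 1);
}
```

A location counter of 5, for instance, would advance to 8 before a .s_floating value is assembled, unless alignment has been disabled with .align 0.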


Programming Considerations 


This chapter gives rules and examples to follow when creating an assembly 
language program. 


The chapter addresses the following topics: 


• Why your assembly programs should use the calling conventions 
observed by the C compiler (Section 6.1) 


• An overview of the composition of executable programs (Section 6.2) 


• The use of registers, section and location counters, and stack frames 
(Section 6.3) 


• A technique for coding an interface between an assembly language 
procedure and a procedure written in a high-level language (Section 6.4) 


• The default memory-allocation scheme used by the Alpha system 
(Section 6.5) 


This chapter does not address coding issues related to performance or 
optimization. See Appendix A of the Alpha Architecture Reference Manual 
for information on how to optimize assembly code. 


6.1 Calling Conventions 


When you write assembly language procedures, you should use the same 
calling conventions that the C compiler observes. The reasons for using the 
same calling conventions are as follows: 


• Often your code must interact with compiler-generated code, accepting 
and returning arguments or accessing shared global data. 


• The symbolic debugger gives better assistance in debugging programs 
that use standard calling conventions. 


The conventions observed by the Tru64 UNIX compiler system are more 
complicated than those of some other compiler systems, mostly to enhance 
the speed of each procedure call. Specifically: 


• The C compiler uses the full, general calling sequence only when 
necessary; whenever possible, it omits unneeded portions of the 
sequence. For example, the C compiler does not use a register as a frame 
pointer if it is unnecessary to do so. 


• The C compiler and the debugger observe certain implicit rules instead of 
communicating by means of instructions or data at execution time. For 
example, the debugger looks at information placed in the symbol table 
by a .frame directive at compilation time. This technique enables the 
debugger to tolerate the lack of a register containing a frame pointer at 
execution time. 


• The linker performs code optimizations based on information that is 
not available at compile time. For example, the linker can, in some 
cases, replace the general calling sequence to a procedure with a single 
instruction. 


6.2 Program Model 


A program consists of an executable image and zero or more shared images. 
Each image has an independent text and data area. 


Each data segment contains a global offset table (GOT), which contains 
address constants for procedures and data locations that the text segment 
references. The GOT provides the means to access arbitrary 64-bit addresses 
and allows the text segment to be position-independent. 


The size of the GOT is limited only by the maximum image size. However, 
because only 64 KB can be addressed by a single memory-format instruction, 
the GOT is segmented into one or more sections of 64 KB or less. 


In addition to providing efficient access to the GOT, the gp register is also 
used to access global data within ±2 GB of the global pointer. This area of 
memory is known as the global data area. 
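

The 64 KB and 2 GB figures follow from the signed 16-bit displacement carried by a memory-format instruction. As an illustration, the following C sketch splits a displacement from the global pointer into a high part and a low part, on the assumption that the access is made with an ldah instruction (which scales its displacement by 65536) followed by an instruction with an ordinary 16-bit displacement. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two signed 16-bit pieces of a
 * displacement from $gp. */
typedef struct {
    int32_t high;   /* multiplied by 65536 by ldah */
    int32_t low;    /* sign-extended 16-bit displacement */
} gp_disp;

/* Split disp so that high * 65536 + low == disp. */
gp_disp split_gp_disp(int64_t disp)
{
    gp_disp r;
    int64_t low = ((disp % 65536) + 65536) % 65536;   /* 0..65535 */

    if (low >= 32768)
        low -= 65536;                                 /* -32768..32767 */
    r.low = (int32_t)low;
    r.high = (int32_t)((disp - low) / 65536);
    /* The high part must itself fit in a signed 16-bit field. */
    assert(r.high >= -32768 && r.high <= 32767);
    return r;
}
```

A displacement whose high part is zero can be reached with a single instruction; this is the 64 KB window that determines the size of a GOT section.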


A static executable image is not a special case in the program model. It is 
simply an executable image that uses no shared libraries. However, it is 
possible for the linker to perform code optimizations. In particular, if a static 
executable image’s GOT is less than or equal to 64 KB (that is, has only one 
segment), the code to load, save, and restore the gp register is not necessary 
because all procedures will access the same GOT segment. 


6.3 General Coding Concerns 


This section describes three general areas of concern to the assembly 
language programmer: 


• Usable and restricted registers (Section 6.3.1) 


• Control of section and location counters with directives (Section 6.3.2) 


• Stack frame requirements on entering and exiting a procedure 
(Section 6.3.3) 


Another general coding consideration is the use of data structures to 
communicate between high-level language procedures and assembly 
procedures. In most cases, this communication is handled by means of simple 
variables: pointers, integers, Booleans, and single- and double-precision real 
numbers. Describing the details of the various high-level data structures 
that can also be used (arrays, records, sets, and so on) is beyond the 
scope of this manual. 


6.3.1 Register Use 


The main processor has 32 64-bit integer registers. The uses and restrictions 
of these registers are described in Table 6-1. 


The floating-point coprocessor has 32 floating-point registers. Each register 
can hold either a single-precision (32 bit) or double-precision (64 bit) value. 
See Table 6-2 for details. 


Table 6-1: Integer Registers 


Register Name   Software Name     Use 
                (from regdef.h) 

$0              v0                Used for expression evaluations and to 
                                  hold the integer function results. Not 
                                  preserved across procedure calls. 

$1-8            t0-t7             Temporary registers used for expression 
                                  evaluations. Not preserved across 
                                  procedure calls. 

$9-14           s0-s5             Saved registers. Preserved across 
                                  procedure calls. 

$15 or $fp      s6 or fp          Contains the frame pointer (if needed); 
                                  otherwise, a saved register. 

$16-21          a0-a5             Used to pass the first six integer type 
                                  actual arguments. Not preserved 
                                  across procedure calls. 

$22-25          t8-t11            Temporary registers used for expression 
                                  evaluations. Not preserved across 
                                  procedure calls. 

$26             ra                Contains the return address. Preserved 
                                  across procedure calls. 

$27             pv or t12         Contains the procedure value and 
                                  used for expression evaluation. Not 
                                  preserved across procedure calls. 

$28 or $at      AT                Reserved for the assembler. Not 
                                  preserved across procedure calls. 

$29 or $gp      gp                Contains the global pointer. Not 
                                  preserved across procedure calls. 

$30 or $sp      sp                Contains the stack pointer. Preserved 
                                  across procedure calls. 

$31             zero              Always has the value 0. 


Table 6-2: Floating-Point Registers 


Register Name   Use 

$f0-f1          Used to hold floating-point type function results ($f0) and 
                complex type function results ($f0 has the real part and $f1 
                has the imaginary part). Not preserved across procedure calls. 

$f2-f9          Saved registers. Preserved across procedure calls. 

$f10-f15        Temporary registers used for expression evaluation. 
                Not preserved across procedure calls. 

$f16-f21        Used to pass the first six single or double-precision actual 
                arguments. Not preserved across procedure calls. 

$f22-f30        Temporary registers used for expression evaluations. 
                Not preserved across procedure calls. 

$f31            Always has the value 0.0. 
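

The "Preserved across procedure calls" entries in Table 6-1 and Table 6-2 can be condensed into two small predicates. The following C sketch is only a restatement of the tables; the function names are hypothetical.

```c
#include <assert.h>

/* Returns 1 if integer register $reg is listed in Table 6-1 as
 * preserved across procedure calls, otherwise 0. */
int int_reg_is_preserved(int reg)
{
    assert(reg >= 0 && reg <= 31);
    if (reg >= 9 && reg <= 15)        /* s0-s5 and s6 (or fp) */
        return 1;
    return reg == 26 || reg == 30;    /* ra and sp */
}

/* Returns 1 if floating-point register $f<reg> is listed in
 * Table 6-2 as preserved across procedure calls, otherwise 0. */
int fp_reg_is_preserved(int reg)
{
    assert(reg >= 0 && reg <= 31);
    return reg >= 2 && reg <= 9;      /* $f2-$f9 */
}
```

A procedure that uses a register for which the predicate returns 1 is expected to save and restore that register, as described in Section 6.3.3.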


6.3.2 Using Directives to Control Sections and Location Counters 


Assembled code and data are stored in the object file sections shown in 
Figure 6-1. Each section has an implicit location counter that begins at zero 
and increments by one for each byte assembled in the section. Location 
control directives (.align, .data, .rconst, .rdata, .sdata, .space, and 
.text) can be used to control what is stored in the various sections and 
to adjust location counters. 


The assembler always generates the text section before other sections. 
Additions to the text section are done in 4-byte units. 


The bss (block started by symbol) section holds data items (usually variables) 
that are initialized to zero. If a .lcomm directive defines a variable, the 
assembler assigns that variable to either the .bss section or the .sbss 
(small bss) section, depending on the variable’s size. 


The default size for variables in the .sbss section is eight or fewer bytes. 
You can change the size using the -G compilation option for the C compiler 
or the assembler. Items smaller than or equal to the specified size go in the 
.sbss section. Items greater than the specified size go in the .bss section. 
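

The size rule for .lcomm variables can be stated as a one-line comparison. The following C sketch only restates the rule above; the enumeration, the macro, and the function name are hypothetical, and the limit argument stands for the value given with the -G option.

```c
#include <assert.h>

#define DEFAULT_SMALL_DATA_LIMIT 8UL   /* default size stated in the text */

enum bss_section { SECTION_SBSS, SECTION_BSS };

/* Section chosen for a .lcomm variable of the given size in bytes. */
enum bss_section lcomm_section(unsigned long size, unsigned long limit)
{
    assert(size > 0);   /* this sketch considers only non-empty items */
    return (size <= limit) ? SECTION_SBSS : SECTION_BSS;
}
```

With the default limit, an 8-byte variable would be placed in .sbss and a 9-byte variable in .bss.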


At run time, the $gp register points into the area of memory occupied by the 
.lita section. The .lita section is used to hold address literals for 64-bit 
addressing. 


Figure 6-1: Sections and Location Counters for Nonshared Object Files 


See the Symbol Table/Object File Specification manual for more information 
on section data. (This manual is only available as an HTML document on 
the Tru64 UNIX Version 5.0 Documentation CD-ROM; it is not available in 
hardcopy or in PS or PDF formats.) 


6.3.3 The Stack Frame 


The C compiler classifies each procedure into one of the following categories: 


• Nonleaf procedures. These procedures call other procedures. 


• Leaf procedures. These procedures do not themselves call other 
procedures. Leaf procedures are of two types: those that require stack 
storage for local variables and those that do not. 


You must decide the procedure category before determining the calling 
sequence. 


To write a program with proper stack frame usage and debugging 
capabilities, you should observe the conventions presented in the following 
list of steps. Steps 1 through 6 describe the code you must provide at the 
beginning of a procedure, step 7 describes how to pass parameters, and steps 
8 through 12 describe the code you must provide at the end of a procedure: 


1. Regardless of the type of procedure, you should include a .ent directive 
and an entry label for the procedure: 


.ent procedure_name 
procedure_name: 


The .ent directive generates information for the debugger, and the 
entry label is the procedure name. 


2. If you are writing a procedure that references static storage, calls other 
procedures, uses constants greater than 31 bits in size, or uses floating 
constants, you must load the $gp register with the global pointer value 
for the procedure: 


ldgp $gp, 0($27) 


Register $27 contains the procedure value (the address of this procedure 
as supplied by the caller). 


3. If you are writing a leaf procedure that does not use the stack, skip to 
step 4. For a nonleaf procedure or a leaf procedure that uses the stack, 
you must adjust the stack size by allocating all of the stack space that 
the procedure requires: 


lda $sp, -framesize($sp) 


The framesize operand is the size of frame required, in bytes, and 
must be a multiple of 16. You must allocate space on the stack for the 
following items: 


• Local variables. 


• Saved general registers. Space should be allocated only for those 
registers saved. For nonleaf procedures, you must save register $26, 
which is used in the calls to other procedures from this procedure. If 
you use registers $9 to $15, you must also save them. 


• Saved floating-point registers. Space should be allocated only for 
those registers saved. If you use registers $f2 to $f9, you must 
also save them. 


• Procedure call argument area. You must allocate the maximum 
number of bytes for arguments of any procedure that you call from 
this procedure; this area does not include space for the first six 
arguments because they are always passed in registers. 


Note 


Once you have modified register $sp, you should not modify 
it again in the remainder of the procedure. 


4. To generate information used by the debugger and exception handler, 
you must include a .frame directive: 


.frame framereg, framesize, returnreg 


The virtual frame pointer does not have a register allocated for it. It 
consists of the framereg ($sp, in most cases) added to the framesize 
(see step 3). Figure 6-2 shows the stack components. 


Figure 6-2: Stack Organization 


The returnreg argument for the .frame directive specifies the 
register that contains the return address (usually register $26). The 
usual values may change if you use a varying stack pointer or are 
specifying a kernel trap procedure. 


5. If the procedure is a leaf procedure that does not use the stack, skip to 
step 11. Otherwise, you must save the registers for which you allocated 
space in step 3. 


Saving the general registers requires the following operations: 


• Specify which registers are to be saved using the following .mask 
directive: 


.mask bitmask, frameoffset 


The bit settings in bitmask indicate which registers are to be saved. 
For example, if register $9 is to be saved, bit 9 in bitmask must be 
set to 1. The value for frameoffset is the offset (negative) from the 
virtual frame pointer to the start of the register save area. 


• Use the following stq instruction to save the registers specified in 
the .mask directive: 


stq reg, framesize+frameoffset+N($sp) 


The value of N is the size of the argument build area for the first 
register and is incremented by 8 for each successive register. 

If the procedure is a nonleaf procedure, the return address 
register ($26) is the first register to be saved; it must be saved at 
framesize+frameoffset+0($sp) for exception handling. For 
example, a nonleaf procedure that saves register $9 and $10 would 
use the following stq instructions: 


stq $26, framesize+frameoffset($sp) 
stq $9, framesize+frameoffset+8($sp) 
stq $10, framesize+frameoffset+16($sp) 


(Figure 6-2 shows the order in which the registers in the preceding 
example would be saved.) 


Then, save any floating-point registers for which you allocated space 
in step 3: 


.fmask bitmask, frameoffset 
stt reg, framesize+frameoffset+N($sp) 


Saving floating-point registers is identical to saving integer registers 
except you use the .fmask directive instead of .mask, and the 
storage operations involve single or double-precision floating-point 
data. (The previous discussion about how to save integer registers 
applies here as well.) 


6. The final step in creating the procedure’s prologue is to mark its end 
as follows: 


.prologue flag 


The flag is set to 1 if the prologue contains an ldgp instruction (see 
step 2); otherwise, it is set to zero. 


7. This step describes parameter passing: how to access arguments 
passed into your procedure and how to pass arguments correctly to 
other procedures. For information on high-level, language-specific 
constructs (call-by-name, call-by-value, string or structure passing), see 
the programmer's guides for the high-level languages used to write the 
procedures that interact with your program. 


General registers $16 to $21 and floating-point registers $f16 to $f21 
are used for passing the first six arguments. All nonfloating-point 
arguments in the first six arguments are passed in general registers. 
All floating-point arguments in the first six arguments are passed in 
floating-point registers. 


Stack space is used for passing the seventh and subsequent arguments. 
The stack space allocated to each argument is an 8-byte multiple and is 
aligned on an 8-byte boundary. 


Table 6-3 summarizes the location of procedure arguments in the 
register or stack. 


Table 6-3: Argument Locations 


Argument   Integer Register   Floating-Point   Stack 
Number                        Register 

1          $16 (a0)           $f16 

2          $17 (a1)           $f17 

3          $18 (a2)           $f18 

4          $19 (a3)           $f19 

5          $20 (a4)           $f20 

6          $21 (a5)           $f21 

7-n                                            0($sp)..(n-7)*8($sp) 


8. On procedure exit, you must restore registers that were saved in step 5. 
To restore general purpose registers: 


ldq reg, framesize+frameoffset+N($sp) 


To restore the floating-point registers: 


ldt reg, framesize+frameoffset+N($sp) 


(See step 5 for a discussion of the value of N.) 


9. Get the return address: 


ldq $26, framesize+frameoffset($sp) 


10. Clean up the stack: 


lda $sp, framesize($sp) 


11. Return: 


ret $31, ($26), 1 


12. End the procedure: 


.end procedure_name 
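

The arithmetic behind steps 3 and 5 can be checked with a few lines of C. The following is a sketch only: all of the function names are hypothetical, each saved register is assumed to occupy 8 bytes, and save_slot assumes that N starts at 0 (that is, that there is no argument build area).

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte count up to the next multiple of 16. */
long round_up_16(long nbytes)
{
    return (nbytes + 15) / 16 * 16;
}

/* Frame size for step 3: local variables, saved general and
 * floating-point registers, and the procedure call argument area. */
long frame_size(long local_bytes, int saved_int_regs,
                int saved_fp_regs, long arg_area_bytes)
{
    assert(local_bytes >= 0 && arg_area_bytes >= 0);
    assert(saved_int_regs >= 0 && saved_fp_regs >= 0);
    return round_up_16(local_bytes + 8L * saved_int_regs
                       + 8L * saved_fp_regs + arg_area_bytes);
}

/* Bitmask operand for .mask or .fmask: bit n is set for register n. */
uint32_t save_mask(const int *regs, int count)
{
    uint32_t mask = 0;
    int i;

    for (i = 0; i < count; i++) {
        assert(regs[i] >= 0 && regs[i] <= 31);
        mask |= (uint32_t)1 << regs[i];
    }
    return mask;
}

/* Displacement from $sp of the index-th saved register, following
 * framesize+frameoffset+N with N = 8 * index. */
long save_slot(long framesize, long frameoffset, int index)
{
    assert(index >= 0);
    return framesize + frameoffset + 8L * index;
}
```

For a nonleaf procedure that saves only $26 and has no local variables, these helpers give a frame size of 16, a mask of 0x04000000, and a save slot of 0($sp) when frameoffset is -16.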


6.3.4 Coding Examples 


The examples in this section show procedures written in C and the 
equivalent procedures written in assembly language. 


Example 6-1 shows a nonleaf procedure. Note that it creates a stack frame 
and saves its return address. It saves its return address because it must put 
a new return address into register $26 when it makes a procedure call. 


Example 6-1: Nonleaf Procedure 


int 
nonleaf(i, j) 
  int i, *j; 
  { 
  int abs(); 
  int temp; 

  temp = i - *j; 
  return abs(temp); 
  } 

        .globl  nonleaf 
 #   1  int 
 #   2  nonleaf(i, j) 
 #   3    int i, *j; 
 #   4    { 
        .ent    nonleaf 2 
nonleaf: 
        ldgp    $gp, 0($27) 
        lda     $sp, -16($sp) 
        stq     $26, 0($sp) 
        .mask   0x04000000, -16 
        .frame  $sp, 16, $26, 0 
        .prologue 1 
        addl    $16, 0, $18 
 #   5    int abs(); 
 #   6    int temp; 
 #   7 
 #   8    temp = i - *j; 
        ldl     $1, 0($17) 
        subl    $18, $1, $16 
 #   9    return abs(temp); 
        jsr     $26, abs 
        ldgp    $gp, 0($26) 
        ldq     $26, 0($sp) 
        lda     $sp, 16($sp) 
        ret     $31, ($26), 1 
        .end    nonleaf 


Example 6-2 shows a leaf procedure that does not require stack space for 
local variables. Note that it does not create a stack frame and does not save a 
return address. 


Example 6-2: Leaf Procedure Without Stack Space for Local Variables 


int 
leaf(p1, p2) 
  int p1, p2; 
  { 
  return (p1 > p2) ? p1 : p2; 
  } 

        .globl  leaf 
 #   1  leaf(p1, p2) 
 #   2    int p1, p2; 
 #   3    { 
        .ent    leaf 2 
leaf: 
        ldgp    $gp, 0($27) 
        .frame  $sp, 0, $26, 0 
        .prologue 1 
        addl    $16, 0, $16 
        addl    $17, 0, $17 
 #   4    return (p1 > p2) ? p1 : p2; 
        bis     $17, $17, $0 
        cmplt   $0, $16, $1 
        cmovne  $1, $16, $0 
        ret     $31, ($26), 1 
        .end    leaf 


Example 6-3 shows a leaf procedure that requires stack space for local 
variables. Note that it creates a stack frame but does not save a return 
address. 


Example 6-3: Leaf Procedure with Stack Space for Local Variables 


int 
leaf_storage(i) 
  int i; 
  { 
  int a[16]; 
  int j; 
  for (j = 0; j < 10; j++) 
    a[j] = '0' + j; 
  return a[i]; 
  } 

        .globl  leaf_storage 
 #   1  int 
 #   2  leaf_storage(i) 
 #   3    int i; 
 #   4    { 
        .ent    leaf_storage 2 
leaf_storage: 
        ldgp    $gp, 0($27) 
        lda     $sp, -80($sp) 
        .frame  $sp, 80, $26, 0 
        .prologue 1 
        addl    $16, 0, $1 
 #   5    int a[16]; 
 #   6    int j; 
 #   7    for (j = 0; j < 10; j++) 
        ldil    $2, 48 
        stl     $2, 16($sp) 
        ldil    $3, 49 
        stl     $3, 20($sp) 
        ldil    $0, 2 
        lda     $16, 24($sp) 
$32: 
 #   8    a[j] = '0' + j; 
        addl    $0, 48, $4 
        stl     $4, 0($16) 
        addl    $0, 49, $5 
        stl     $5, 4($16) 
        addl    $0, 50, $6 
        stl     $6, 8($16) 
        addl    $0, 51, $7 
        stl     $7, 12($16) 
        addl    $0, 4, $0 
        addq    $16, 16, $16 
        subq    $0, 10, $8 
        bne     $8, $32 
 #   9    return a[i]; 
        mull    $1, 4, $22 
        addq    $22, $sp, $0 
        ldl     $0, 16($0) 
        lda     $sp, 80($sp) 
        ret     $31, ($26), 1 
        .end    leaf_storage 


6.4 Developing Code for Procedure Calls 


The rules and parameter requirements for passing control and exchanging 
data between procedures written in assembly language and procedures 
written in other languages are varied and complex. The simplest approach 
to coding an interface between an assembly procedure and a procedure 
written in a high-level language is to do the following: 


• Use the high-level language to write a skeletal version of the procedure 
that you plan to code in assembly language. 


• Compile the program using the -S option, which creates an assembly 
language (.s) version of the compiled source file. 


• Study the assembly language listing and then, using the code in the 
listing as a guideline, write your assembly language code. 


Section 6.4.1 and Section 6.4.2 describe techniques you can use to create 
interfaces between procedures written in assembly language and procedures 
written in a high-level language. The examples show what to look for in 
creating your interface. Details such as register numbers will vary according 
to the number, order, and data types of the arguments. In writing your 
particular interface, you should write and compile realistic examples of the 
code you want to write in assembly language. 
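

When reading such a listing, it can help to know in advance where each argument should appear. The following C sketch restates Table 6-3 as a function; the type and function names are hypothetical, and the stack offset is relative to $sp at the point of the call.

```c
#include <assert.h>

enum arg_class { ARG_INTEGER, ARG_FLOATING };

struct arg_location {
    int  in_register;     /* 1 if passed in a register, 0 if on the stack */
    int  reg;             /* register number: 16-21 ($16-$21 or $f16-$f21) */
    int  is_fp_register;  /* 1 if reg names a floating-point register */
    long stack_offset;    /* byte offset from $sp for arguments 7 and up */
};

/* Location of argument number n (counting from 1), per Table 6-3. */
struct arg_location locate_arg(int n, enum arg_class cls)
{
    struct arg_location loc = {0, 0, 0, 0};

    assert(n >= 1);
    if (n <= 6) {
        loc.in_register = 1;
        loc.reg = 15 + n;                       /* argument 1 uses register 16 */
        loc.is_fp_register = (cls == ARG_FLOATING);
    } else {
        loc.stack_offset = (n - 7) * 8L;        /* 0($sp), 8($sp), ... */
    }
    return loc;
}
```

For example, the third argument of a call is expected in $18 if it is an integer or pointer and in $f18 if it is floating point, and the ninth argument is expected at 16($sp).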


6.4.1 Calling a High-Level Language Procedure 


The following steps show an approach to use in writing an assembly 
language procedure that calls atof(3), a procedure written in C that converts 
ASCII characters to numbers: 


1. Write a C program that calls atof. Pass global variables instead 
of local variables; this makes them easy to recognize in the assembly 
language version of the C program (and ensures that optimization does 
not remove any of the code on the grounds that it has no effect). 


The following C program is an example of a program that calls atof: 


char c[] = "3.1415"; 
double d, atof(); 
float f; 
caller() 
  { 
  d = atof(c); 
  f = (float)atof(c); 
  } 


2. Compile the program using the following compiler options: 


cc -S -O caller.c 


The -S option causes the compiler to produce the assembly language 
listing; the -O option, though not required, reduces the amount of code 
generated, making the listing easier to read. 


3. After compilation, examine the file caller.s. The comments in the file 
show how the parameters are passed, the execution of the call, and how 
the returned values are retrieved: 


        .globl  c 
        .data 
c: 
        .ascii  "3.1415\x00" 
        .comm   d 8 
        .comm   f 4 
        .text 
        .globl  caller 
 #   1  char c[] = "3.1415"; 
 #   2  double d, atof(); 
 #   3  float f; 
 #   4  caller() 
 #   5    { 
        .ent    caller 2 
caller: 
        ldgp    $gp, 0($27) 
        lda     $sp, -16($sp) 
        stq     $26, 0($sp) 
        .mask   0x04000000, -16 
        .frame  $sp, 16, $26, 0 
        .prologue 1 
 #   6    d = atof(c); 
        lda     $16, c 
        jsr     $26, atof 
        ldgp    $gp, 0($26) 
        stt     $f0, d 
 #   7    f = (float)atof(c); 
        lda     $16, c 
        jsr     $26, atof 
        ldgp    $gp, 0($26) 
        cvtts   $f0, $f10 
        sts     $f10, f 
 #   8    } 
        ldq     $26, 0($sp) 
        lda     $sp, 16($sp) 
        ret     $31, ($26), 1 
        .end    caller 


6.4.2 Calling an Assembly Language Procedure 


The following steps show an approach to use in writing an assembly language 
procedure that can be called by a procedure written in a high-level language: 


1. Using a high-level language, write a facsimile of the assembly language 
procedure you want to call. In the body of the procedure, write 
statements that use the same arguments you intend to use in the final 
assembly language procedure. Copy the arguments to global variables 
instead of local variables to make it easy for you to read the resulting 
assembly language listing. 


The following C program is a facsimile of the assembly language 
program: 


typedef char str[10]; 
typedef int boolean; 

float global_r; 
int global_i; 
str global_s; 
boolean global_b; 

boolean callee(float *r, int i, str s) 
{ 
  global_r = *r; 
  global_i = i; 
  global_s[0] = s[0]; 
  return i == 3; 
} 


2. Compile the program using the following compiler options: 


cc -S -O callee.c 


The -S option causes the compiler to produce the assembly language 
listing; the -O option, though not required, reduces the amount of code 
generated, making the listing easier to read. 


3. After compilation, examine the file callee.s. The comments in the file 
show how the parameters are passed, the execution of the call, and how 
the returned values are retrieved: 


        .comm   global_r 4 
        .comm   global_i 4 
        .comm   global_s 10 
        .comm   global_b 4 
        .text 
        .globl  callee 
 #  10  { 
        .ent    callee 2 
callee: 
        ldgp    $gp, 0($27) 
        .frame  $sp, 0, $26, 0 
        .prologue 1 
        addl    $17, 0, $17 
 #  11    global_r = *r; 
        lds     $f10, 0($16) 
        sts     $f10, global_r 
 #  12    global_i = i; 
        stl     $17, global_i 
 #  13    global_s[0] = s[0]; 
        ldq_u   $1, 0($18) 
        extbl   $1, $18, $1 
        .set    noat 
        lda     $28, global_s 
        ldq_u   $2, 0($28) 
        insbl   $1, $28, $3 
        mskbl   $2, $28, $2 
        bis     $2, $3, $2 
        stq_u   $2, 0($28) 
        .set    at 
 #  14    return i == 3; 
        cmpeq   $17, 3, $0 
        ret     $31, ($26), 1 
        .end    callee 


6.5 Memory Allocation 


The default memory allocation scheme used by the Alpha system gives 
every process two storage areas that can grow without bounds. A process 
exceeds virtual storage only when the sum of the two areas exceeds virtual 
storage space. By default, the linker and assembler use the scheme shown in 
Figure 6-3. 


Figure 6-3: Default Layout of Memory (User Program View) 


0xffff ffff ffff ffff 
                         Reserved for kernel 
0xffff fc00 0000 0000 
0xffff fbff ffff ffff 
                         Not accessible 
0x0000 0400 0000 0000 
0x0000 03ff ffff ffff 
                         Reserved for shared libraries 
                         Reserved for dynamic loader 
0x0000 03ff 8000 0000 
0x0000 03ff 7fff ffff 
                         Can be mapped by program [1] 
                         Heap (grows up) [2] 
                         bss segment [3] 
                         Data segment [3]  <- $gp 
                         Text segment [3] 
0x0000 0001 2000 0000 
0x0000 0001 1fff ffff 
                         Stack (grows toward zero) [4]  <- $sp 
                         Can be mapped by program [1] 
0x0000 0000 0001 0000 
0x0000 0000 0000 ffff 
                         Not accessible (by convention) (64 KB) 
0x0000 0000 0000 0000 


1. This area is not allocated until a user requests it. (The same behavior is 
observed in System V shared memory regions.) 


2. The heap is reserved for sbrk and brk system calls, and it is not always 
present. 


3. See the Symbol Table/Object File Specification manual for details on 
the sections contained within the bss, data, and text segments. (This 
manual is only available as an HTML document on the Tru64 UNIX 
Version 5.0 Documentation CD-ROM; it is not available in hardcopy 
or in PS or PDF formats.) 


4. The stack is used for local data in C programs. 
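

The address ranges in Figure 6-3 can be expressed as a simple classification function. The following C sketch uses only the boundary addresses shown in the figure; the enumeration and function names are hypothetical, and the sketch describes the default layout only.

```c
#include <assert.h>
#include <stdint.h>

enum region {
    REGION_NOT_ACCESSIBLE_LOW,   /* lowest 64 KB, by convention */
    REGION_STACK_OR_MAPPABLE,    /* stack and mappable area below the text segment */
    REGION_PROGRAM,              /* text, data, bss, heap, and mappable area */
    REGION_SHARED_LIBS_LOADER,   /* shared libraries and dynamic loader */
    REGION_NOT_ACCESSIBLE,       /* hole below the kernel area */
    REGION_KERNEL                /* reserved for kernel */
};

/* Classify a 64-bit virtual address according to Figure 6-3. */
enum region classify_address(uint64_t addr)
{
    if (addr < UINT64_C(0x0000000000010000)) return REGION_NOT_ACCESSIBLE_LOW;
    if (addr < UINT64_C(0x0000000120000000)) return REGION_STACK_OR_MAPPABLE;
    if (addr < UINT64_C(0x000003ff80000000)) return REGION_PROGRAM;
    if (addr < UINT64_C(0x0000040000000000)) return REGION_SHARED_LIBS_LOADER;
    if (addr < UINT64_C(0xfffffc0000000000)) return REGION_NOT_ACCESSIBLE;
    return REGION_KERNEL;
}

/* Small self-check against two boundary addresses in the figure. */
void check_layout(void)
{
    assert(classify_address(0) == REGION_NOT_ACCESSIBLE_LOW);
    assert(classify_address(UINT64_C(0x0000000120000000)) == REGION_PROGRAM);
}
```

The text segment begins at 0x0000 0001 2000 0000, so the address immediately below it falls in the stack region.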




A 


Instruction Summaries 


The tables in this appendix summarize the assembly language instruction 
set: 


• Table A-1 summarizes the main instruction set. 


• Table A-2 summarizes the floating-point instruction set. 


• Table A-3 summarizes the rounding and trapping modes supported by 
some floating-point instructions. 


Most of the assembly language instructions translate into single instructions 
in machine code. 


The tables in this appendix show the format of each instruction in the main 
instruction set and the floating-point instruction set. The tables list the 
instruction names and the forms of operands that can be used with each 
instruction. The specifiers used in the tables to identify operands have the 
following meanings: 


Operand Specifier        Description 

address                  A symbolic expression whose effective value 
                         is used as an address. 

b_reg                    Base register. A register containing a base address 
                         to which is added an offset (or displacement) 
                         value to produce an effective address. 

d_reg                    Destination register. A register that receives 
                         a value as a result of an operation. 

d_reg/s_reg              One register that is used as both a destination 
                         register and a source register. 

label                    A label that identifies a location in a program. 

no_operands              No operands are specified. 

offset                   An immediate value that is added to the contents 
                         of a base register to calculate an effective address. 

palcode                  A value that determines the operation performed 
                         by a PAL instruction. 

s_reg, s_reg1, s_reg2    Source registers. Registers whose contents 
                         are to be used in an operation. 

val_expr                 An expression whose value is used as 
                         an absolute value. 

val_immed                An immediate value that is to be used 
                         in an operation. 

jhint                    An address operand that provides a hint of where 
                         a jmp or jsr instruction will transfer control. 

rhint                    An immediate operand that provides software 
                         with a hint about how a ret or jsr_coroutine 
                         instruction is used. 


The tables in this appendix are segmented into groups of instructions that 
have the same operand options; the operands specified within a particular 
segment of the table apply to all of the instructions contained in that 
segment. 


Table A-1: Main Instruction Set Summary


Instruction                                   Mnemonic        Operands

Load Address                                  lda [a]         d_reg, address
Load Byte                                     ldb
Load Byte Unsigned                            ldbu
Load Word                                     ldw
Load Word Unsigned                            ldwu
Load Sign Extended Longword                   ldl [a]
Load Sign Extended Longword Locked            ldl_l [a]
Load Quadword                                 ldq [a]
Load Quadword Locked                          ldq_l [a]
Load Quadword Unaligned                       ldq_u [a]
Load Unaligned Word                           uldw
Load Unaligned Word Unsigned                  uldwu
Load Unaligned Longword                       uldl
Load Unaligned Quadword                       uldq

Store Byte                                    stb             s_reg, address
Store Word                                    stw
Store Longword                                stl [a]
Store Longword Conditional                    stl_c [a]
Store Quadword                                stq [a]
Store Quadword Conditional                    stq_c [a]
Store Quadword Unaligned                      stq_u [a]
Store Unaligned Word                          ustw
Store Unaligned Longword                      ustl
Store Unaligned Quadword                      ustq

Load Address High                             ldah [a]        d_reg, offset(b_reg)
Load Global Pointer                           ldgp

Load Immediate Longword                       ldil            d_reg, val_expr
Load Immediate Quadword                       ldiq

Branch if Equal to Zero                       beq             s_reg, label
Branch if Not Equal to Zero                   bne
Branch if Less Than Zero                      blt
Branch if Less Than or Equal to Zero          ble
Branch if Greater Than Zero                   bgt
Branch if Greater Than or Equal to Zero       bge
Branch if Low Bit is Clear                    blbc
Branch if Low Bit is Set                      blbs

Branch                                        br              d_reg, label or label
Branch to Subroutine                          bsr

Jump                                          jmp [a]         d_reg,(s_reg),jhint or
Jump to Subroutine                            jsr [a]         d_reg,(s_reg) or
                                                              (s_reg),jhint or (s_reg) or
                                                              d_reg,address or address

Return from Subroutine                        ret             d_reg,(s_reg),rhint or
Jump to Subroutine Return                     jsr_coroutine   d_reg,(s_reg) or
                                                              d_reg,rhint or d_reg or
                                                              (s_reg),rhint or (s_reg) or
                                                              rhint or no_operands

Architecture Mask                             amask           s_reg,d_reg or
                                                              val_immed,d_reg

Clear                                         clr             d_reg
Implementation Version                        implver

Absolute Value Longword                       absl            s_reg,d_reg or d_reg/s_reg
Absolute Value Quadword                       absq            or val_immed,d_reg
Move                                          mov
Negate Longword (without overflow)            negl
Negate Longword (with overflow)               neglv
Negate Quadword (without overflow)            negq
Negate Quadword (with overflow)               negqv
Logical Complement (NOT)                      not
Sign-Extension Byte                           sextb
Sign-Extension Longword                       sextl
Sign-Extension Word                           sextw

Add Longword (without overflow)               addl            s_reg1,s_reg2,d_reg or
Add Longword (with overflow)                  addlv           d_reg/s_reg1,s_reg2 or
Add Quadword (without overflow)               addq            s_reg1,val_immed,d_reg or
Add Quadword (with overflow)                  addqv           d_reg/s_reg1,val_immed
Scaled Longword Add by 4                      s4addl
Scaled Quadword Add by 4                      s4addq
Scaled Longword Add by 8                      s8addl
Scaled Quadword Add by 8                      s8addq
Compare Signed Quadword Equal                 cmpeq
Compare Signed Quadword Less Than             cmplt
Compare Signed Quadword Less Than or Equal    cmple
Compare Unsigned Quadword Less Than           cmpult
Compare Unsigned Quadword Less Than or Equal  cmpule
Multiply Longword (without overflow)          mull
Multiply Longword (with overflow)             mullv
Multiply Quadword (without overflow)          mulq
Multiply Quadword (with overflow)             mulqv
Subtract Longword (without overflow)          subl
Subtract Longword (with overflow)             sublv
Subtract Quadword (without overflow)          subq
Subtract Quadword (with overflow)             subqv
Scaled Longword Subtract by 4                 s4subl
Scaled Quadword Subtract by 4                 s4subq
Scaled Longword Subtract by 8                 s8subl
Scaled Quadword Subtract by 8                 s8subq
Unsigned Quadword Multiply High               umulh
Divide Longword                               divl
Divide Longword Unsigned                      divlu
Divide Quadword                               divq
Divide Quadword Unsigned                      divqu
Longword Remainder                            reml
Longword Remainder Unsigned                   remlu
Quadword Remainder                            remq
Quadword Remainder Unsigned                   remqu
Logical Product (AND)                         and
Logical Sum (OR)                              bis
Logical Sum (OR)                              or
Logical Difference (XOR)                      xor
Logical Product with Complement (ANDNOT)      bic
Logical Product with Complement (ANDNOT)      andnot
Logical Sum with Complement (ORNOT)           ornot
Logical Equivalence (XORNOT)                  eqv
Logical Equivalence (XORNOT)                  xornot
Move if Equal to Zero                         cmoveq
Move if Not Equal to Zero                     cmovne
Move if Less Than Zero                        cmovlt
Move if Less Than or Equal to Zero            cmovle
Move if Greater Than Zero                     cmovgt
Move if Greater Than or Equal to Zero         cmovge
Move if Low Bit Clear                         cmovlbc
Move if Low Bit Set                           cmovlbs
Shift Left Logical                            sll
Shift Right Logical                           srl
Shift Right Arithmetic                        sra
Compare Byte                                  cmpbge
Extract Byte Low                              extbl
Extract Word Low                              extwl
Extract Longword Low                          extll
Extract Quadword Low                          extql
Extract Word High                             extwh
Extract Longword High                         extlh
Extract Quadword High                         extqh
Insert Byte Low                               insbl
Insert Word Low                               inswl
Insert Longword Low                           insll
Insert Quadword Low                           insql
Insert Word High                              inswh
Insert Longword High                          inslh
Insert Quadword High                          insqh
Mask Byte Low                                 mskbl
Mask Word Low                                 mskwl
Mask Longword Low                             mskll
Mask Quadword Low                             mskql
Mask Word High                                mskwh
Mask Longword High                            msklh
Mask Quadword High                            mskqh
Zero Bytes                                    zap
Zero Bytes NOT                                zapnot

Call Privileged Architecture Library          call_pal        palcode

Prefetch Data                                 fetch           offset(b_reg)
Prefetch Data, Modify Intent                  fetch_m

Read Process Cycle Counter                    rpcc            d_reg or d_reg,s_reg

No Operation                                  nop             no_operands
Universal No Operation                        unop
Trap Barrier                                  trapb
Exception Barrier                             excb
Memory Barrier                                mb
Write Memory Barrier                          wmb

Count Leading Zeros                           ctlz            s_reg, d_reg
Count Population                              ctpop
Count Trailing Zeros                          cttz


[a] In addition to the normal operands that can be specified with this
    instruction, relocation operands can also be specified (see Section 2.6.4).


A number of the floating-point instructions in Table A-2 support qualifiers 
that control rounding and trapping modes. Table notes identify the qualifiers 
that can be used with a particular instruction. (The notes also identify the 
instructions on which relocation operands can be specified.) 


Qualifiers are appended as suffixes to the particular instructions that 
support them; for example, the instruction cvtdg with the sc qualifier 
would be coded cvtdgsc. 


The qualifier suffixes consist of one or more characters, with each character 
identifying a particular rounding or trapping mode. Table A-3 defines the 
rounding or trapping modes associated with each character. 
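
The suffix mechanism can be modelled in a few lines. The following Python sketch is an illustration only and is not part of the assembler: the names MODES, with_qualifier, and describe_qualifier are hypothetical, and the dictionary entries are transcribed from Table A-3. It does not check whether a given instruction actually accepts a given suffix; the notes to Table A-2 govern that.

```python
# Suffix characters and their meanings, transcribed from Table A-3.
MODES = {
    "c": "Chopped rounding",
    "d": "Dynamic rounding",
    "m": "Minus infinity rounding",
    "s": "Software completion",
    "u": "Underflow trap enabled",
    "v": "Integer overflow trap enabled",
    "i": "Inexact trap enabled",
}


def with_qualifier(mnemonic, suffix):
    """Append a qualifier suffix to a mnemonic (cvtdg + sc gives cvtdgsc)."""
    return mnemonic + suffix


def describe_qualifier(suffix):
    """Return the Table A-3 description of each character in a suffix.

    An empty suffix means normal rounding. An unknown character raises
    KeyError.
    """
    if suffix == "":
        return ["Normal rounding"]
    return [MODES[ch] for ch in suffix]


print(with_qualifier("cvtdg", "sc"))
print(describe_qualifier("sc"))
```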


Table A-2: Floating-Point Instruction Set Summary


Instruction                                   Mnemonic        Operands

Load F_Floating                               ldf [a]         d_reg, address
Load G_Floating (Load D_Floating)             ldg [a]
Load S_Floating (Load Longword)               lds [a]
Load T_Floating (Load Quadword)               ldt [a]

Store F_Floating                              stf [a]         s_reg, address
Store G_Floating (Store D_Floating)           stg [a]
Store S_Floating (Store Longword)             sts [a]
Store T_Floating (Store Quadword)             stt [a]

Load Immediate F_Floating                     ldif            d_reg, val_expr
Load Immediate D_Floating                     ldid
Load Immediate G_Floating                     ldig
Load Immediate S_Floating                     ldis
Load Immediate T_Floating                     ldit

Branch Equal to Zero                          fbeq            s_reg, label or label
Branch Not Equal to Zero                      fbne
Branch Less Than Zero                         fblt
Branch Less Than or Equal to Zero             fble
Branch Greater Than Zero                      fbgt
Branch Greater Than or Equal to Zero          fbge

Floating Clear                                fclr            d_reg

Floating Move                                 fmov            s_reg, d_reg or
Floating Negate                               fneg            d_reg/s_reg
Floating Absolute Value                       fabs
Negate F_Floating                             negf [b]
Negate G_Floating                             negg [b]
Negate S_Floating                             negs [c]
Negate T_Floating                             negt [c]

Copy Sign                                     cpys            s_reg1, s_reg2, d_reg
Copy Sign Negate                              cpysn           or d_reg/s_reg1, s_reg2
Copy Sign and Exponent                        cpyse
Move if Equal to Zero                         fcmoveq
Move if Not Equal to Zero                     fcmovne
Move if Less Than Zero                        fcmovlt
Move if Less Than or Equal to Zero            fcmovle
Move if Greater Than Zero                     fcmovgt
Move if Greater Than or Equal to Zero         fcmovge
Add F_Floating                                addf [d]
Add G_Floating                                addg [d]
Add S_Floating                                adds [e]
Add T_Floating                                addt [e]
Compare G_Floating Equal                      cmpgeq [b]
Compare G_Floating Less Than                  cmpglt [b]
Compare G_Floating Less Than or Equal         cmpgle [b]
Compare T_Floating Equal                      cmpteq [c]
Compare T_Floating Less Than                  cmptlt [c]
Compare T_Floating Unordered                  cmptun [c]
Compare T_Floating Less Than or Equal         cmptle [c]
Divide F_Floating                             divf [d]
Divide G_Floating                             divg [d]
Divide S_Floating                             divs [e]
Divide T_Floating                             divt [e]
Multiply F_Floating                           mulf [d]
Multiply G_Floating                           mulg [d]
Multiply S_Floating                           muls [e]
Multiply T_Floating                           mult [e]
Subtract F_Floating                           subf [d]
Subtract G_Floating                           subg [d]
Subtract S_Floating                           subs [e]
Subtract T_Floating                           subt [e]

Convert Quadword to Longword                  cvtql [f]       s_reg, d_reg or
Convert Longword to Quadword                  cvtlq           d_reg/s_reg
Convert G_Floating to Quadword                cvtgq [g]
Convert T_Floating to Quadword                cvttq [h]
Convert Quadword to F_Floating                cvtqf [i]
Convert Quadword to G_Floating                cvtqg [i]
Convert Quadword to S_Floating                cvtqs [j]
Convert Quadword to T_Floating                cvtqt [j]
Convert D_Floating to G_Floating              cvtdg [d]
Convert G_Floating to D_Floating              cvtgd [d]
Convert G_Floating to F_Floating              cvtgf [d]
Convert T_Floating to S_Floating              cvtts [e]
Convert S_Floating to T_Floating              cvtst [b]

Move From FP Control Register                 mf_fpcr         d_reg

Move To FP Control Register                   mt_fpcr         s_reg

Floating No Operation                         fnop            no_operands


[a] In addition to the normal operands that can be specified with this
    instruction, relocation operands can also be specified (see Section 2.6.4).
[b] s
[c] su
[d] c, u, uc, s, sc, su, suc
[e] c, m, d, u, uc, um, ud, su, suc, sum, sud, sui, suic, suim, suid
[f] sv, v
[g] c, v, vc, s, sc, sv, svc
[h] c, v, vc, sv, svc, svi, svic, d, vd, svd, svid
[i] c
[j] c, m, d, sui, suic, suim, suid


See the text immediately preceding Table A-2 for a description of the table
notes.


Table A-3: Rounding and Trapping Modes 


Suffix        Description

(no suffix)   Normal rounding
c             Chopped rounding
d             Dynamic rounding
m             Minus infinity rounding
s             Software completion
u             Underflow trap enabled
v             Integer overflow trap enabled
i             Inexact trap enabled


32-Bit Considerations


The Alpha architecture is a quadword (64-bit) architecture, with limited 
backward compatibility for longword (32-bit) operations. The Alpha 
architecture's design philosophy for longword operations is to use the 
quadword instructions wherever possible and to include specialized longword 
instructions for high-frequency operations. 


B.1 Canonical Form 


Longword operations deal with longword data stored in canonical form in 
quadword registers. The canonical form has the longword data in the low 32 
bits (0-31) of the register, with bit 31 replicated in the high 32 bits (32-63). 
Note that the canonical form is the same for both signed and unsigned 
longword data. 
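
As an illustration of the canonical form, the following Python sketch models a 64-bit register as a nonnegative integer. It is a model for reasoning only, not assembler output; the helper names canonical and is_canonical are hypothetical.

```python
MASK32 = 0xFFFFFFFF
HIGH32 = 0xFFFFFFFF00000000


def canonical(value):
    """Return the 64-bit register image of a longword in canonical form.

    Bits 0-31 are kept and bit 31 is replicated into bits 32-63.
    """
    low = value & MASK32
    if low & 0x80000000:
        return low | HIGH32
    return low


def is_canonical(reg):
    """True if a 64-bit register image already holds a canonical longword."""
    return reg == canonical(reg)


# Signed -1 and unsigned 0xFFFFFFFF share one canonical representation.
print(hex(canonical(-1)), hex(canonical(0xFFFFFFFF)))
```

Because the same register image serves both signed and unsigned longwords, the instruction that consumes the value, not the value itself, determines the interpretation.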


To create a canonical form operand from longword data, use the ldl, ldl_l,
or uldl instruction.


To create a canonical form operand from a constant, use the ldil instruction.
The ldil instruction is a macro instruction that expands into a series of
instructions, including the lda and ldah instructions.
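
The exact sequence the assembler emits for ldil is not given here. The Python sketch below shows only one well-known way that a signed 32-bit constant can be split into a high part for ldah and a low part for lda, assuming the usual semantics (lda adds a sign-extended 16-bit displacement to its base register; ldah adds that displacement multiplied by 65536). The function names are hypothetical, and constants that need more than one ldah/lda pair are rejected rather than handled.

```python
def sext16(x):
    """Sign-extend the low 16 bits of x."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def lda(base, disp):
    """Model of lda: base plus a sign-extended 16-bit displacement."""
    return base + sext16(disp)


def ldah(base, disp):
    """Model of ldah: base plus the displacement times 65536."""
    return base + sext16(disp) * 65536


def split_constant(value):
    """Split value into (high, low) so that high * 65536 + low == value.

    Both parts must fit a signed 16-bit field; otherwise a single
    ldah/lda pair is not enough and ValueError is raised.
    """
    if not -0x80000000 <= value <= 0x7FFFFFFF:
        raise ValueError("not a signed 32-bit constant")
    low = sext16(value)
    high = (value - low) >> 16   # exact: value - low is a multiple of 65536
    if not -0x8000 <= high <= 0x7FFF:
        raise ValueError("needs more than one ldah/lda pair")
    return high, low


high, low = split_constant(0x1234ABCD)
print(hex(high), low, hex(lda(ldah(0, high), low)))
```

Note that the high part is adjusted upward by one whenever the low 16 bits have their top bit set, because lda sign-extends its displacement.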


B.2 Longword Instructions 


The Alpha architecture includes the following longword instructions:

• Load Longword (ldl)
• Load Longword Locked (ldl_l)
• Store Longword (stl)
• Store Longword Conditional (stl_c)
• Add Longword (addl, addlv)
• Subtract Longword (subl, sublv)
• Multiply Longword (mull, mullv)
• Scaled Longword Add (s4addl, s8addl)
• Scaled Longword Subtract (s4subl, s8subl)


In addition, the assembler provides the following longword macro
instructions:

• Divide Longword (divl, divlu)
• Remainder Longword (reml, remlu)
• Negate Longword (negl, neglv)
• Unaligned Load Longword (uldl)
• Load Immediate Longword (ldil)
• Absolute Value Longword (absl)
• Sign-Extension Longword (sextl)


All longword instructions, with the exception of stl and stl_c, generate
results in canonical form.


All longword instructions that have source operands produce correct results, 
regardless of whether the data items in the source registers are in canonical 
form. 


See Chapter 3 for a detailed description of the longword instructions. 


B.3 Quadword Instructions for Longword Operations 
The following quadword instructions, if presented with two canonical
longword operands, produce a canonical longword result:

• Logical AND (and)
• Logical OR (bis)
• Logical Exclusive OR (xor)
• Logical OR NOT (ornot)
• Logical Equivalence (eqv)
• Conditional Move (cmovxx)
• Compare (cmpxx)
• Conditional Branch (bxx)
• Arithmetic Shift Right (sra)


Note that these instructions, unlike the longword instructions, must have
operands in canonical form to produce correct results.


See Chapter 3 for a detailed description of the quadword instructions. 
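
The requirement for canonical operands can be checked on a small model. The Python sketch below treats a register as a 64-bit nonnegative integer; the helpers canonical, is_canonical, and sra are hypothetical names used only for this illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def canonical(value):
    """Low 32 bits of value with bit 31 replicated into bits 32-63."""
    low = value & 0xFFFFFFFF
    return low | 0xFFFFFFFF00000000 if low & 0x80000000 else low


def is_canonical(reg):
    return reg == canonical(reg)


def sra(reg, count):
    """Model of sra: 64-bit arithmetic shift right by count (0-63)."""
    signed = reg - (1 << 64) if reg & (1 << 63) else reg
    return (signed >> count) & MASK64


a = canonical(0x80000001)
b = canonical(0x7FFFFFFF)

# Canonical operands in, canonical result out.
print(is_canonical(a & b), is_canonical(a | b), is_canonical(a ^ b))

# sra on a canonical operand gives the expected longword result ...
print(hex(sra(canonical(0x80000000), 4)))
# ... but the same longword in noncanonical form gives a different answer.
print(hex(sra(0x0000000080000000, 4)))
```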




B.4 Logical Shift Instructions 


No instructions, either machine or macro, exist for performing logical shifts
on canonical longwords.


To perform a logical shift left, use the following instruction sequence:


    sll  $rx, xx, $ry      # noncanonical result
    addl $ry, 0, $ry       # sign-extend bit 31


To perform a logical shift right, use the following instruction sequence:


    zap  $rx, 0xf0, $ry    # noncanonical result
    srl  $ry, xx, $ry      # if xx >= 1, bring in zeros
    addl $ry, 0, $ry       # sign-extend bit 31


Note that the addl instruction is not needed if the shift count in the previous
sequence is guaranteed to be nonzero.
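
The two sequences can be traced with a small Python model in which a register is a 64-bit nonnegative integer. This is a sketch for checking the reasoning, not assembler output; the function names are hypothetical, and the shift count is assumed to lie in the range 0 to 63.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def addl_zero(reg):
    """Model of addl $ry, 0, $ry: keep bits 0-31 and sign-extend bit 31."""
    low = reg & 0xFFFFFFFF
    return low | 0xFFFFFFFF00000000 if low & 0x80000000 else low


def zap(reg, byte_mask):
    """Model of zap: clear byte i of reg for every set bit i of byte_mask."""
    for i in range(8):
        if byte_mask & (1 << i):
            reg &= ~(0xFF << (8 * i))
    return reg & MASK64


def longword_shift_left(reg, count):
    shifted = (reg << count) & MASK64   # sll: result may be noncanonical
    return addl_zero(shifted)           # addl: restore canonical form


def longword_shift_right(reg, count):
    cleared = zap(reg, 0xF0)            # zap: clear bits 32-63
    shifted = cleared >> count          # srl: zeros enter from the left
    return addl_zero(shifted)           # addl: needed when count is 0


print(hex(longword_shift_left(0x40000000, 1)))
print(hex(longword_shift_right(0xFFFFFFFF80000000, 4)))
print(hex(longword_shift_right(0xFFFFFFFF80000000, 0)))
```

With a shift count of zero the zap leaves bit 31 set and bits 32-63 clear, which is why the final addl is required in that case; any nonzero count moves a zero into bit 31, so the result is already canonical.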


B.5 Conversions to Quadword 


A signed longword value in canonical form is also a proper signed quadword
value and no conversions are needed.


An unsigned longword value in canonical form is not a proper unsigned
quadword value. To convert an unsigned longword to a quadword, use the
following instruction sequence:


    zap $rx, 0xf0, $ry     # clear bits 32-63
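
The difference between the signed and unsigned cases can be seen on a concrete value. The Python sketch below models a register as a 64-bit nonnegative integer; the function names are hypothetical and the code illustrates the rule above rather than any assembler behavior.

```python
def as_signed_quadword(reg):
    """Interpret a 64-bit register image as a signed quadword."""
    return reg - (1 << 64) if reg & (1 << 63) else reg


def unsigned_longword_to_quadword(reg):
    """Model of zap $rx, 0xf0, $ry: clear bytes 4-7, that is, bits 32-63."""
    return reg & 0x00000000FFFFFFFF


# Canonical register image of the longword 0xFFFFFFFF.
reg = 0xFFFFFFFFFFFFFFFF

# Read as a signed longword (-1), the register is already a proper
# signed quadword.
print(as_signed_quadword(reg))

# Read as an unsigned longword (4294967295), the register is not a proper
# unsigned quadword until bits 32-63 are cleared.
print(reg, unsigned_longword_to_quadword(reg))
```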


B.6 Conversions to Longword 


To convert a quadword value to either a signed or unsigned longword, use 
the following instruction sequence: 


addl $rx, 0, $ry # Sign-extend bit-31


Basic Machine Definition


The assembly language instructions described in this manual are a superset
of the actual machine-code instructions. Generally, the assembly language 
instructions match the machine-code instructions; however, in some cases 
the assembly language instructions are macros that generate more than one 
machine-code instruction (the division instructions in assembly language 
are examples). This appendix describes the assembly language instructions 
that generate more than one machine-code instruction. 


You can, in most instances, consider the assembly language instructions as 
machine-code instructions; however, for routines that require tight coding 
for performance reasons, you must be aware of the assembly language 
instructions that generate more than one machine-code instruction. 


C.1 Implicit Register Use 


Register $28 ($at) is reserved as a temporary register for use by the
assembler.


Some assembly language instructions require additional temporary
registers. For these instructions, the assembler uses one or more of the
general-purpose temporary registers (t0 through t12). The following table lists
the instructions that require additional temporary registers and the specific
registers that they use:


Instruction     Registers Used

ldb             AT,t9
ldbu            AT,t9 [a]
ldw             AT,t9
ldwu            AT,t9 [a]
stb             AT,t9,t10 [a]
stw             AT,t9,t10 [a]
ustw            AT,t9,t10,t11,t12
ustl            AT,t9,t10,t11,t12
ustq            AT,t9,t10,t11,t12
uldw            AT,t9,t10
uldwu           AT,t9,t10
uldl            AT,t9,t10
uldq            AT,t9,t10
divl            AT,t9,t10,t11,t12
divq            AT,t9,t10,t11,t12
divlu           AT,t9,t10,t11,t12
divqu           AT,t9,t10,t11,t12
reml            AT,t9,t10,t11,t12
remq            AT,t9,t10,t11,t12
remlu           AT,t9,t10,t11,t12
remqu           AT,t9,t10,t11,t12


[a] Use of registers depends on the setting of the .arch directive or the
    -arch flag on the cc command line.


The registers that equate to the software names (from regdef.h) in the
preceding table are as follows:


Software Name   Register

AT              $28 or $at
t9              $23
t10             $24
t11             $25
t12 or pv       $27


Note 


The div and rem instructions destroy the contents of t12 only if 
the third operand is a register other than t12. See Section C.5 
for more details. 


C.2 Addresses 


If you use an address as an operand and it references a data item that does
not have an absolute address in the range -32768 to 32767, the assembler
may generate a machine-code instruction to load the address of the data
(from the literal address section) into $at.


The assembler's ldgp (load global pointer) instruction generates an lda and
an ldah instruction. The assembler requires the ldgp instruction because
ldgp couples relocation information with the instruction.


C.3 Immediate Values


If you use an immediate value as an operand and the immediate value falls
outside the range -32768 to 32767 for the ldil and ldiq instructions or
the range 0 to 255 for other instructions, multiple machine instructions are
generated to load the immediate value into the destination register or $at.
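
The two ranges can be stated as a simple predicate. The Python sketch below only restates the rule above; the function name fits_without_expansion is hypothetical, and the code says nothing about which instructions the assembler generates when the value is out of range.

```python
def fits_without_expansion(mnemonic, value):
    """True if value lies in the range quoted above for this mnemonic.

    ldil and ldiq accept -32768 to 32767; other instructions accept
    0 to 255. Values outside the range cause multiple machine
    instructions to be generated.
    """
    if mnemonic in ("ldil", "ldiq"):
        return -32768 <= value <= 32767
    return 0 <= value <= 255


for mnemonic, value in (("ldil", -32768), ("ldiq", 32768),
                        ("addq", 255), ("addq", 256), ("addq", -1)):
    print(mnemonic, value, fits_without_expansion(mnemonic, value))
```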


C.4 Load and Store Instructions 


On most processors that implement the Alpha architecture, loading and
storing unaligned data or data less than 32 bits is done with multiple
machine-code instructions. Except on EV56 Alpha processors, the following
assembler instructions generate multiple machine-code instructions:

• Load Byte (ldb)
• Load Byte Unsigned (ldbu)
• Load Word (ldw)
• Load Word Unsigned (ldwu)
• Unaligned Load Word (uldw)
• Unaligned Load Word Unsigned (uldwu)
• Unaligned Load Longword (uldl)
• Unaligned Load Quadword (uldq)
• Store Byte (stb)
• Store Word (stw)
• Unaligned Store Word (ustw)
• Unaligned Store Longword (ustl)
• Unaligned Store Quadword (ustq)


Signed loads may require one more instruction than an unsigned load. 


On EV56 Alpha processors, the following instructions from the preceding list
generate a single instruction:

• Load Byte Unsigned (ldbu)
• Load Word Unsigned (ldwu)
• Store Byte (stb)
• Store Word (stw)


C.5 Integer Arithmetic Instructions 


Multiply operations using constant powers of two are turned into sll or
scaled add instructions.
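
The equivalence behind this substitution can be checked with a short Python model. This is a sketch under stated assumptions: the text does not say which form the assembler chooses for a given constant, the function names are hypothetical, and s4addq is modelled with its usual meaning (first operand times 4, plus second operand) on 64-bit values.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def shift_count(constant):
    """Return k such that constant == 2**k, or None if there is no such k."""
    if constant > 0 and constant & (constant - 1) == 0:
        return constant.bit_length() - 1
    return None


def multiply_by_shift(x, constant):
    """Value that sll x, k would produce for constant == 2**k."""
    k = shift_count(constant)
    if k is None:
        raise ValueError("constant is not a power of two")
    return (x << k) & MASK64


def s4addq(a, b):
    """Model of a scaled add: a * 4 + b, truncated to 64 bits."""
    return (a * 4 + b) & MASK64


print(multiply_by_shift(5, 8), s4addq(5, 0), shift_count(12))
```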


There are no machine instructions for performing integer division (divl,
divlu, divq, and divqu) or remainder operations (reml, remlu, remq,
and remqu). The machine instructions generated for these assembler
instructions depend on the operands specified on the instructions.


Division and remainder operations involving constant values are replaced 
by an instruction sequence that depends on the data type of the numerator 
and the value of the constant. 


Division and remainder operations involving nonconstant values are 
replaced with a procedure call to a library routine to perform the operation. 
The library routines are in the C run-time library (libc). The library 
routines use a nonstandard parameter passing mechanism. The first 
operand is passed in register t10 and the second operand is passed in t11. 
The result is returned in t12. If the operands specified are other than 
those just described, the assembler moves them to the correct registers. 
The library routines expect the return address in t9; therefore, a routine 
that uses divide instructions does not need to save register ra just because 
it uses divide instructions. 


The absl and absq (absolute value) instructions generate two machine
instructions.


C.6 Floating-Point Load Immediate Instructions 


There are no floating-point instructions that accept an immediate value
(except for 0.0). Whenever the assembler encounters a floating-point load
immediate instruction, the immediate value is stored in the data section and
a load instruction is generated to load the value.


C.7 One-to-One Instruction Mappings 




Some assembler instructions generate single machine instructions. Such 
assembler instructions are sometimes referred to as pseudoinstructions. 
The following table lists these assembler instructions and their equivalent 
machine instructions:


Assembler Instruction        Machine Instruction

andnot $rx,$ry,$rz           bic    $rx,$ry,$rz
clr    $rx                   bis    $31,$31,$rx
fabs   $fx,$fy               cpys   $f31,$fx,$fy
fclr   $fx                   cpys   $f31,$f31,$fx
fmov   $fx,$fy               cpys   $fx,$fx,$fy
fneg   $fx,$fy               cpysn  $fx,$fx,$fy
fnop                         cpys   $f31,$f31,$f31
mov    $rx,$ry               bis    $rx,$rx,$ry
mov    val_immed,$rx         bis    $31,val_immed,$rx
negf   $fx,$fy               subf   $f31,$fx,$fy
negfs  $fx,$fy               subfs  $f31,$fx,$fy
negg   $fx,$fy               subg   $f31,$fx,$fy
neggs  $fx,$fy               subgs  $f31,$fx,$fy
negl   $rx,$ry               subl   $31,$rx,$ry
neglv  $rx,$ry               sublv  $31,$rx,$ry
negq   $rx,$ry               subq   $31,$rx,$ry
negqv  $rx,$ry               subqv  $31,$rx,$ry
negs   $fx,$fy               subs   $f31,$fx,$fy
negssu $fx,$fy               subssu $f31,$fx,$fy
negt   $fx,$fy               subt   $f31,$fx,$fy
negtsu $fx,$fy               subtsu $f31,$fx,$fy
nop                          bis    $31,$31,$31
not    $rx,$ry               ornot  $31,$rx,$ry
or     $rx,$ry,$rz           bis    $rx,$ry,$rz
sextl  $rx,$ry               addl   $rx,0,$ry
unop                         ldq_u  $31,0($sp)
xornot $rx,$ry,$rz           eqv    $rx,$ry,$rz


PALcode Instruction Summaries


This appendix summarizes the Privileged Architecture Library (PALcode)
instructions that are required to support an Alpha system.


By including the file pal.h (use #include <alpha/pal.h>) in your
assembly language program, you can use the symbolic names for the
PALcode instructions.


D.1 Unprivileged PALcode Instructions


Table D-1 describes the unprivileged PALcode instructions.


Table D-1: Unprivileged PALcode Instructions


Symbolic Name   Number   Operation and Description

PAL_bpt         0x80     Break Point Trap - switches mode to kernel
                         mode, builds a stack frame on the kernel stack,
                         and dispatches to the breakpoint code.

PAL_bugchk      0x81     Bugcheck - switches mode to kernel mode,
                         builds a stack frame on the kernel stack, and
                         dispatches to the breakpoint code.

PAL_callsys     0x83     System call - switches mode to kernel mode,
                         builds a callsys stack frame, and dispatches
                         to the system call code.

PAL_gentrap     0xaa     Generate Trap - switches mode to kernel mode,
                         builds a stack frame on the kernel stack, and
                         dispatches to the gentrap code.

PAL_imb         0x86     I-Stream Memory Barrier - makes the I-cache
                         coherent with main memory.

PAL_rdunique    0x9e     Read Unique - returns the contents of
                         the process unique register.

PAL_wrunique    0x9f     Write Unique - writes the process
                         unique register.


D.2 Privileged PALcode Instructions 


The privileged PALcode instructions can be called only from kernel mode.
They provide an interface to control the privileged state of the machine.
Table D-2 describes the privileged PALcode instructions.


Table D-2: Privileged PALcode Instructions


Symbolic Name   Number   Operation and Description

PAL_halt        0x00     Halt Processor - stops normal instruction
                         processing. Depending on the halt action
                         setting, the processor can either enter console
                         mode or the restart sequence.

PAL_rdps        0x36     Read Process Status - returns the current
                         process status.

PAL_rdusp       0x3a     Read User Stack Pointer - reads the user stack
                         pointer while in kernel mode and returns it.

PAL_rdval       0x32     Read System Value - reads a 64-bit per-processor
                         value and returns it.

PAL_rtsys       0x3d     Return from System Call - pops the return
                         address, the user stack pointer, and the user
                         global pointer from the kernel stack. It then
                         saves the kernel stack pointer, sets mode to
                         user mode, enables interrupts, and jumps to
                         the address popped off the stack.

PAL_rti         0x3f     Return from Trap, Fault, or Interrupt - pops
                         certain registers from the kernel stack. If the
                         new mode is user mode, the kernel stack is
                         saved and the user stack is restored.

PAL_swpctx      0x30     Swap Privileged Context - saves the current
                         process data in the current process control
                         block (PCB). Then it switches to the new PCB and
                         loads the new process context.

PAL_swpipl      0x35     Swap IPL - returns the current IPL value
                         and sets the IPL.

PAL_tbi         0x33     TB Invalidate - removes entries from the
                         instruction and data translation buffers when
                         the mapping entries change.

PAL_whami       0x3c     Who Am I - returns the processor number
                         for the current processor. The processor
                         number is in the range 0 to the number of
                         processors minus one (0..numproc-1) that can
                         be configured into the system.

PAL_wrfen       0x2b     Write Floating-Point Enable - writes a bit to
                         the floating-point enable register.

PAL_wrkgp       0x37     Write Kernel Global Pointer - writes the kernel
                         global pointer internal register.

PAL_wrusp       0x38     Write User Stack Pointer - writes a value to the
                         user stack pointer while in kernel mode.

PAL_wrval       0x31     Write System Value - writes a 64-bit
                         per-processor value.

PAL_wrvptptr    0x2d     Write Virtual Page Table Pointer - writes a
                         pointer to the virtual page table pointer (vptptr).
absq instruction, 3-8, 3-10 
addf instruction, 4-11 
addg instruction, 4-11 
addl instruction, 3-10 
addlv instruction, 3-9 
addq instruction, 3-9 
addqv instruction, 3-9, 3-11 
address formats, 2-12 
addresses 
special handling, C-2 
adds instruction, 4-12 
addt instruction, 4-12 
.aent directive, 5-2 
.align directive, 5-3 
amask instruction, 3-24, 3-25 
and instruction, 3-15 
andnot instruction, 3-16 
earch directive, 5-3 
arithmetic instructions 
floating-point instruction set, 4-10 
main instruction set, 3-8 
.ascii directive, 5-3 
easciiz directive, 5-3 
assembler directives, 5-1 


Index 


byte ordering, 1-2 

bis instruction, 3-15 

blbc instruction, 3-19 

blbs instruction, 3-19 

ble instruction, 3-19 

blt instruction, 3-19 

bne instruction, 3-19 

br instruction, 3-20 

bsr instruction, 3-20 

.bss section, 6-4 

.byte directive, 5-4 

byte ordering 
big-endian, 1-2 
littleendian, 1-2 

byte-manipulation instructions 
main instruction set, 3-21 


Cc 


backslash escape character, 2-3 
beg instruction, 3-19 

bge instruction, 3-19 

bgt instruction, 3-20 

bic instruction, 3-15 
big-endian 


C programs 

-S compilation option, 6-14 
call_pal instruction, 3-24, 3-25 
calling conventions, 6-1 
chopped rounding (IEEE), 4-6 
chopped rounding (VAX), 4-6 
dr instruction, 3-8 
cmovegq instruction, 3-18 
cmovge instruction, 3-18 
cmovgt instruction, 3-18 
cmovibc instruction, 3-18 
cmovibs instruction, 3-18 
cmovie instruction, 3-18 
cmovit instruction, 3-18 
cmovne instruction, 3-18 
cmpbge instruction, 3-21 
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cmpeg instruction, 3-17 
cmpgeq instruction, 4-14 
cmpgle instruction, 4-14 
cmpg|t instruction, 4-13 
cmple instruction, 3-17 
cmplt instruction, 3-17 
cmpteg instruction, 4-13 
cmptle instruction, 4-13 
cmptit instruction, 4-13 
cmptun instruction, 4-13 
cmpule instruction, 3-17 
cmpult instruction, 3-17 
code optimization, 6-1 
.comm directive, 5-4 
comments, 2-1 
compilation options 

-S option, 6-14 
constant 

floating-point, 2-2 

scalar, 2-2 

string, 2-3 
control instructions 

floating-point instruction set, 4-15 

main instruction set, 3-18 
counters, 6-4 
cpys instruction, 4-14, 4-15 
cpyse instruction, 4-14, 4-15 
cpysn instruction, 4-14, 4-15 
ctlz instruction, 3-24, 3-26 
ctpop instruction, 3-25 
cttz instruction, 3-25 
cvtdg instruction, 4-11, 4-13 
evtgd instruction, 4-11, 4-13 
cvtof instruction, 4-11, 4-13 
cvtgq instruction, 4-11 
cvtlq instruction, 4-12 
cvtof instruction, 4-11, 4-13 
cvtag instruction, 4-11, 4-13 
cvtql instruction, 4-11 
cvtgs instruction, 4-11, 4-13 
cvtqt instruction, 4-11, 4-13 
cvtst instruction, 4-11, 4-13 
cvttq instruction, 4-12 
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cvtts instruction, 4-11, 4-13 


D 


.d_floating directive, 5-4 
.data directive, 5-4 
data type 

types supported, 2-10 
directive 

assembler directives, 5-1 
divf instruction, 4-11 
divg instruction, 4-12 
divl instruction, 3-9, 3-12 
divlu instruction, 3-9, 3-12 
divq instruction, 3-9, 3-13 
divqu instruction, 3-9, 3-13 
divs instruction, 4-12 
divt instruction, 4-11 
double directive, 5-4 
dynamic rounding mode, 4-3 


E 


.edata directive, 5-5 
.eflag directive, 5-5 
.end directive, 5-5 
.endr directive, 5-5 
.ent directive, 5-5 
eqv instruction, 3-16 
.err directive, 5-5 
escape character, backslash, 2-3 
exch instruction, 3-24 
exception 
floating-point, 1-5 
main processor, 1-5 
expression operator, 2-8 
expressions 
operator precedence rules, 2-9 
type propagation rules, 2-11 
extbl instruction, 3-21 
.extended directive, 5-6 
.extern directive, 5-6 
extlh instruction, 3-21 


extll instruction, 3-22 
extgh instruction, 3-21 
extql instruction, 3-21 
extwh instruction, 3-21 
extwl instruction, 3-21 


F 


f floating directive, 5-6 
fabs instruction, 4-10, 4-12 
fbeq instruction, 4-16 
fbge instruction, 4-15 
fbgt instruction, 4-15 
fble instruction, 4-16 
fblt instruction, 4-15 
fone instruction, 4-16 
fclr instruction, 4-10, 4-12 
fcmoveq instruction, 4-14, 4-15 
fcmovge instruction, 4-14, 4-15 
fcmovgt instruction, 4-14, 4-15 
fcmovle instruction, 4-14, 4-15 
femovlit instruction, 4-14, 4-15 
fcmovne instruction, 4-14, 4-15 
fetch instruction, 3-24, 3-25 
fetch_m instruction, 3-24, 3-25 
file directive, 5-6 
float directive, 5-6 
floating-point constant, 2-2 
floating-point control register 
(SeeFPCR ) 
floating-point directives 
.d_floating (VAX D_floating), 5-4 
f floating (VAX F_floating), 5-6 
.g floating (VAX G_ floating), 5-7 
.S floating (IEEE single precision), 
5-12 
.t_floating (IEE double precision), 
5-13 
.X_floating (IEE quad precision), 
5-14 
floating-point exception traps, 4-4 


floating-point instruction qualifiers 
rounding mode qualifiers, 4-7 
trapping mode qualifiers, 4-7 

floating-point instruction set, 4-1 

floating-point instructions 
arithmetic instructions, 4-10 
control instructions, 4-15 
load instructions, 4-9 
move instructions, 4-14 
relational instructions, 4-13 
special-purpose instructions, 4-16 
store instructions, 4-9 

floating-point rounding modes, 4-5 

.fmask directive, 5-6 

fmov instruction, 4-14, 4-15 

fneg instruction, 4-10, 4-12 

fnop instruction, 4-16 

FPCR, 4-3 

frame directive, 5-7 


G 


.g floating directive, 5-7 

global offset table 
(SeeGOT ) 

.globl directive, 5-7 

GOT, 6-2 

.gprel32 directive, 5-7 

ident directive, 5-8 


identifier 

syntax, 2-1 
immediate values, C-3 
implicit register use, C-1 
implver instruction, 3-24, 3-25 
infinity 

rounding toward plus or minus 

infinity, 4-6 

insbl instruction, 3-21 
inslh instruction, 3-21, 3-23 

insll instruction, 3-21, 3-23 

insqh instruction, 3-21, 3-23 

insql instruction, 3-21, 3-23 

instruction qualifiers, floating-point 
rounding mode qualifiers, 4-7 
trapping mode qualifiers, 4-7 

instruction set summaries, A-1 

inswh instruction, 3-21, 3-23 

inswl instruction, 3-21, 3-23 

integer arithmetic instructions, C-4 


J 


jmp instruction, 3-20 
jsr instruction, 3-19 
jsr_coroutine instruction, 3-20 


K 


keyword statement, 2-5 


L 


.lab directive, 5-8 

label definition, 2-5 
language interfaces, 6-2 
.lcomm directive, 5-8, 6-4 
lda instruction, 3-2, 3-4 
ldah instruction, 3-3, 3-6 
ldb instruction, 3-2, 3-4 
ldbu instruction, 3-2, 3-4 
ldf instruction, 4-9 
ldg instruction, 4-10 
ldgp instruction, 3-3, 3-6 
ldid instruction, 4-9 
ldif instruction, 4-9 
ldig instruction, 4-10 
ldil instruction, 3-3, 3-6 
ldiq instruction, 3-3, 3-6 
ldis instruction, 4-9 
ldit instruction, 4-9 
ldl instruction, 3-2, 3-4t 
ldl_l instruction, 3-2, 3-5 
ldq instruction, 3-2, 3-5 
ldq_l instruction, 3-2, 3-5 
ldq_u instruction, 3-2, 3-5 
lds instruction, 4-9 
ldt instruction, 4-9 
ldw instruction, 3-2, 3-4 
ldwu instruction, 3-2, 3-4 
linkage conventions 
examples, 6-10 
general, 6-3 
language interfaces, 6-14 
memory allocation, 6-17 
.lit4 directive, 5-8 
.lit8 directive, 5-9 
.lita section, 6-5 
little-endian 
byte ordering, 1-2 
load and store instructions, C-3 
load instructions 
floating-point instruction set, 4-9 
main instruction set, 3-2 
.loc directive, 5-9 
logical instructions 
descriptions of, 3-15 
formats, 3-14 
.long directive, 5-9 


M 


.mask directive, 5-9 
mb instruction, 3-24, 3-25 
mf_fpcr instruction, 4-16 
minus infinity 
rounding toward (IEEE), 4-6 
mnemonic 
definition, 2-5 
mov instruction, 3-18 
move instructions 
floating-point instruction set, 4-14 
main instruction set, 3-17 
mskbl instruction, 3-21, 3-23 
msklh instruction, 3-21, 3-23 
mskll instruction, 3-21, 3-23 


mskqh instruction, 3-21, 3-23 
mskql instruction, 3-21, 3-23 
mskwh instruction, 3-21, 3-23 
mskwl instruction, 3-21, 3-23 
mt_fpcr instruction, 4-16 
mulf instruction, 4-11 
mulg instruction, 4-12 
mull instruction, 3-9, 3-11 
mullv instruction, 3-9, 3-11 
mulq instruction, 3-9, 3-11 
mulqv instruction, 3-9, 3-11 
muls instruction, 4-11 
mult instruction, 4-12 


N 


negf instruction, 4-10, 4-12 
negg instruction, 4-10, 4-12 
negl instruction, 3-8, 3-10 
neglv instruction, 3-8, 3-10 
negq instruction, 3-8, 3-10 
negqv instruction, 3-8, 3-10 
negs instruction, 4-10, 4-12 
negt instruction, 4-10, 4-12 
nop instruction, 3-24, 3-25 
normal rounding (IEEE) 
unbiased round to nearest, 4-6 
normal rounding (VAX) 
biased, 4-5 
not instruction, 3-14, 3-15 
null statement, 2-5 


O 


operator evaluation order 
precedence rules, 2-9 
operator, expression, 2-8 
optimization 
optimizing assembly code, 6-1 
.option directive, 5-9 
or instruction, 3-15 
ornot instruction, 3-16 


P 


PAL code 
instruction summaries, D-1 
performance 
optimizing assembly code, 6-1 
plus infinity 
rounding toward (IEEE), 4-6 
precedence rules 
operator evaluation order, 2-9 
program model 
memory layout, 6-2 
program optimization, 6-1 
.prologue directive, 5-10 
PSCHGO St CUCenS oe A 


Q 


.quad directive, 5-10 


R 


.rconst directive, 5-10 
.rdata directive, 5-10 
register use, 6-3 
registers 
floating-point, 1-2, 6-4t 
general, 1-1 
integer, 1-1, 6-3 
relational instructions 
floating-point instruction set, 4-13 
main instruction set, 3-16 
relocation operand 
syntax and use, 2-6 
reml instruction, 3-9, 3-13 
remlu instruction, 3-9, 3-13 
remq instruction, 3-9, 3-14 
remqu instruction, 3-9, 3-14 
.repeat directive, 5-10 


ret instruction, 3-20 
rounding mode 
chopped rounding (IEEE), 4-6 
chopped rounding (VAX), 4-6 
dynamic rounding qualifier, 4-3 
floating-point instruction qualifiers, 4-7 
floating-point rounding modes, 4-5 
FPCR control, 4-3 
normal rounding (IEEE, unbiased), 4-6 
normal rounding (VAX, biased), 4-5 
rounding toward minus infinity (IEEE), 4-6 
rounding toward plus infinity (IEEE), 4-6 
rpcc instruction, 3-24, 3-25 


S 


-S compilation option, 6-14 
.s files, 6-14 
.s_floating directive, 5-12 
s4addl instruction, 3-9, 3-11 
s4addq instruction, 3-9, 3-11 
s4subl instruction, 3-9, 3-12 
s4subq instruction, 3-9, 3-12 
s8addl instruction, 3-9, 3-11 
s8addq instruction, 3-9, 3-11 
s8subl instruction, 3-9, 3-12 
s8subq instruction, 3-9, 3-12 
.save_ra directive, 5-11 
.sbss section, 6-4 
scalar constant, 2-2 
.sdata directive, 5-11 
.set directive, 5-11 
sextb instruction, 3-8, 3-10 
sextl instruction, 3-8, 3-10 
sextw instruction, 3-8, 3-10 
shift instructions 

descriptions of, 3-15 

formats, 3-14 
sll instruction, 3-15 
.space directive, 5-12 
special-purpose instructions 
floating-point instruction set, 4-16 
main instruction set, 3-24 

sra instruction, 3-15 

srl instruction, 3-15 

stack organization, 6-7 

statement, 2-5 

stb instruction, 3-3, 3-6 

stf instruction, 4-10t 

stg instruction, 4-10t 

stl instruction, 3-3, 3-7 

stl_c instruction, 3-3, 3-7 

store instructions 
floating-point instruction set, 4-9 
main instruction set, 3-2 

stq instruction, 3-3, 3-7 

stq_c instruction, 3-3, 3-7 

stq_u instruction, 3-3, 3-7 

string constant, 2-3 

.struct directive, 5-12 

sts instruction, 4-10t 

stt instruction, 4-9 

stw instruction, 3-3, 3-7 

subf instruction, 4-11 

subg instruction, 4-12 

subl instruction, 3-9, 3-11 

sublv instruction, 3-9, 3-11 

subq instruction, 3-9, 3-12 

subqv instruction, 3-9, 3-12 

subs instruction, 4-12 

subt instruction, 4-11 

symbolic equate, 5-12 


T 


.t_floating directive, 5-13 
.text directive, 5-13 
thread local storage 

( See .tls* directives ) 
.tlscomm directive, 5-13 
.tlsdata directive, 5-13 
.tlslcomm directive, 5-13 
trapb instruction, 3-24, 3-25 


trapping mode 
floating-point instruction qualifiers, 4-7 
.tune directive, 5-14 
type propagation rules 
in expressions, 2-11 


U 


uldl instruction, 3-3, 3-6 
uldq instruction, 3-3, 3-6 
uldw instruction, 3-3, 3-5 
uldwu instruction, 3-3, 3-5 
umulh instruction, 3-9, 3-12 
unop instruction, 3-24, 3-25 
ustl instruction, 3-3, 3-7 
ustq instruction, 3-3, 3-7 
ustw instruction, 3-3, 3-7 


V 


.verstamp directive, 5-14 


W 


.weakext directive, 5-14 
wmb instruction, 3-24, 3-26 
.word directive, 5-14 


X 


.x_floating directive, 5-14 
xor instruction, 3-15 
xornot instruction, 3-15 


Z 


zap instruction, 3-21, 3-24 
zapnot instruction, 3-21, 3-24 


How to Order Tru64 UNIX Documentation 


To order Tru64 UNIX documentation in the United States and Canada, call 
800-344-4825. In other countries, contact your local Compaq subsidiary. 


If you have access to Compaq’s intranet, you can place an order at the following 
Web site: 


http://asmorder.ngo.dec.com/ 


If you need help deciding which documentation best meets your needs, see the Tru64 
UNIX Documentation Overview, which describes the structure and organization of 
the Tru64 UNIX documentation and provides brief overviews of each document. 


The following table provides the order numbers for the Tru64 UNIX operating system 
documentation kits. For additional information about ordering this and related 
documentation, see the Documentation Overview or contact Compaq. 


Name                                              Order Number 
Tru64 UNIX Documentation CD-ROM                   QA-6ADAA-G8 
Tru64 UNIX Documentation Kit                      QA-6ADAA-GZ 
End User Documentation Kit                        QA-6ADAB-GZ 
Startup Documentation Kit                         QA-6ADAC-GZ 
General User Documentation Kit                    QA-6ADAD-GZ 
System and Network Management Documentation Kit   QA-6ADAE-GZ 
Developer’s Documentation Kit                     QA-6ADAF-GZ 
Reference Pages Documentation Kit                 QA-6ADAG-GZ 
TruCluster Server Documentation Kit               QA-6BRAA-GZ 


Reader’s Comments 


Tru64 UNIX 
Assembly Language Programmer's Guide 
AA-RH9LB-TE 


Compaq welcomes your comments and suggestions on this manual. Your input will help us to write 
documentation that meets your needs. Please send your suggestions using one of the following methods: 


• This postage-paid form 
• Internet electronic mail: readers_comment@zk3.dec.com 
• Fax: (603) 884-0120, Attn: UBPG Publications, ZKO3-3/Y32 


If you are not using this form, please be sure to include the name of the document, the page number, and 
the product name and version. 


Please rate this manual: 


Excellent Good Fair Poor 


Accuracy (software works as manual says) 
Clarity (easy to understand) 

Organization (structure of subject matter) 
Figures (useful) 

Examples (useful) 

Index (ability to find topic) 

Usability (ability to access information quickly) 


Please list errors you have found in this manual: 


Page Description 


Additional comments or suggestions to improve this manual: 


What version of the software described by this manual are you using? 


Name, title, department 

Mailing address 

Electronic mail 

Telephone 
Date 